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PREFACE 

See also http://www.wiley.com/college/kreyszig/ 


Goal of the Book. Arrangement of Material 

This new edition continues the tradition of providing instructors and students with a 
comprehensive and up-to-date resource for teaching and learning engineering 
mathematics, that is, applied mathematics for engineers and physicists, mathematicians 
and computer scientists, as well as members of other disciplines. A course in elementary 
calculus is the sole prerequisite . 

The subject matter is arranged into seven parts A-G: 

A Ordinary Differential Equations (ODEs) (Chaps. 1-6) 

B Linear Algebra. Vector Calculus (Chaps. 7-9) 

C Fourier Analysis. Partial Differential Equations (PDEs) (Chaps. 11-12) 

D Complex Analysis (Chaps. 13-18) 

E Numeric Analysis (Chaps. 19-21) 

F Optimization, Graphs (Chaps. 22-23) 

G Probability, Statistics (Chaps. 24-25). 

This is followed by five appendices: 

App. 1 References (ordered by parts) 

App. 2 Answers to Odd-Numbered Problems 
App. 3 Auxiliary Material (see also inside covers) 

App. 4 Additional Proofs 
App. 5 Tables of Functions. 

This book has helped to pave the way for the present development of engineering 
mathematics. By a modem approach to those areas A-G, this new edition will prepare 
the student for the tasks of the present and of the future. The latter can be predicted to 
some extent by a judicious look at the present trend. Among other features, this trend 
shows the appearance of more complex production processes, more extreme physical 
conditions (in space travel, high-speed communication, etc.), and new tasks in robotics 
and communication systems (e.g., fiber optics and scan statistics on random graphs) and 
elsewhere. This requires the refinement of existing methods and the creation of new ones. 

It follows that students need solid knowledge of basic principles, methods, and results, 
and a clear view of what engineering mathematics is all about, and that it requires 
proficiency in all three phases of problem solving: 

• Modeling, that is, translating a physical or other problem into a mathematical form, 
into a mathematical model; this can be an algebraic equation, a differential equation, 
a graph, or some other mathematical expression. 

• Solving the model by selecting and applying a suitable mathematical method, often 
requiring numeric work on a computer. 

• Interpreting the mathematical result in physical or other terms to see what it 
practically means and implies. 

It would make no sense to overload students with all kinds of little things that might be of 
occasional use. Instead they should recognize that mathematics rests on relatively few basic 
concepts and involves powerful unifying principles. This should give them a firm grasp on 
the interrelations among theory , computing, and (physical or other) experimentation . 


V 
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General Features of the Book Include: 

• Simplicity of examples, to make the book teachable — why choose complicated 
examples when simple ones are as instructive or even better? 

• Independence of chapters, to provide flexibility in tailoring courses to special needs. 

• Self-contained presentation, except for a few clearly marked places where a proof 
would exceed the level of the book and a reference is given instead. 

• Modern standard notation, to help students with other courses, modern books, and 
mathematical and engineering journals. 

Many sections were rewritten in a more detailed fashion, to make it a simpler book. This 
also resulted in a better balance between theory and applications. 

Use of Computers 

The presentation is adaptable to various levels of technology and use of a computer or 
graphing calculator: very little or no use, medium use, or intensive use of a graphing 
calculator or of an unspecified CAS (Computer Algebra System, Maple , Mathematical 
or Matlab being popular examples). In either case texts and problem sets form an entity 
without gaps or jumps. And many problems can be solved by hand or with a computer 
or both ways. (For software y see the beginnings of Pan E on Numeric Analysis and Part G 
on Probability and Statistics.) 

More specifically, this new edition on the one hand gives more prominence to tasks 
the computer cannot do, notably, modeling and interpreting results. On the other hand, it 
includes CAS projects , CAS problems , and CAS experiments, which do require a 
computer and show its power in solving problems that are difficult or impossible to access 
otherwise. Here our goal is the combination of intelligent computer use with high-quality 
mathematics. This has resulted in a change from a formula-centered teaching and learning 
of engineering mathematics to a more quantitative, project-oriented, and visual approach. 
CAS experiments also exhibit the computer as an instrument for observations and 
experimentations that may become the beginnings of new research, for “proving” or 
disproving conjectures, or for formalizing empirical relationships that are often quite useful 
to the engineer as working guidelines. These changes will also help the student in 
discovering the experimental aspect of modern applied mathematics . 

Some routine and drill work is retained as a necessity for keeping firm contact with 
the subject matter. In some of it the computer can (but must not) give the student a hand, 
but there are plenty of problems that are more suitable for pencil-and-paper work. 

Major Changes 

1. New Problem Sets. Modern engineering mathematics is mostly teamwork. It usually 
combines analytic work in the process of modeling and the use of computer algebra and 
numerics in the process of solution, followed by critical evaluation of results. Our 
problems — some straightforward, some more challenging, some “thinking problems” not 
accessible by a CAS, some open-ended — reflect this modern situation with its increased 
emphasis on qualitative methods and applications, and the problem sets take care of this 
novel situation by including team projects, CAS projects, and writing projects. The latter 
will also help the student in writing general reports, as they are required in engineering 
work quite frequently. 

2. Computer Experiments, using the computer as an instrument of “experimental 
mathematics” for exploration and research (see also above). These are mostly open-ended 
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experiments, demonstrating the use of computers in experimentally finding results, which 
may be provable afterward or may be valuable heuristic qualitative guidelines to the 
engineer, in particular in complicated problems. 

3. More on modeling and selecting methods, tasks that usually cannot be automated. 

4. Student Solutions Manual and Study Guide enlarged, upon explicit requests 
of the users. This Manual contains worked-out solutions to carefully selected odd-numbered 
problems (to which App. 1 gives only the final answers) as well as general comments 
and hints on studying the text and working further problems, including explanations on 
the significance and character of concepts and methods in the various sections of the 
book. 


Further Changes, New Features 

• Electric circuits moved entirely to Chap. 2, to avoid duplication and repetition 

• Second-order ODEs and Higher Order ODEs placed into two separate chapters 
(2 and 3) 

• In Chap. 2, applications presented before variation of parameters 

• Series solutions somewhat shortened, without changing the order of sections 

• Material on Laplace transforms brought into a better logical order: partial fractions 
used earlier in a more practical approach, unit step and Dirac’s delta put into separate 
subsequent sections, differentiation and integration of transforms (not of functions!) 
moved to a later section in favor of practically more important topics 

• Second- and third-order determinants made into a separate section for reference 
throughout the book 

• Complex matrices made optional 

• Three sections on curves and their application in mechanics combined in a single section 

• First two sections on Fourier series combined to provide a better, more direct start 

• Discrete and Fast Fourier Transforms included 

• Conformal mapping presented in a separate chapter and enlarged 

• Numeric analysis updated 

• Backward Euler method included 

• Stiffness of ODEs and systems discussed 

• List of software (in Part E) updated; another list for statistics software added (in Part G) 

• References updated, now including about 75 books published or reprinted after 1990 


Suggestions for Courses: A Four-Semester Sequence 

The material, when taken in sequence, is suitable for four consecutive semester courses, 
meeting 3-4 hours a week: 


1 st Semester. 
2nd Semester. 
3rd Semester. 
4th Semester. 


ODEs (Chaps. 1-5 or 6) 

Linear Algebra, Vector Analysis (Chaps. 7-10) 
Complex Analysis (Chaps. 13-18) 

Numeric Methods (Chaps. 19-21) 
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Suggestions for Independent One-Semester Courses 

The book is also suitable for various independent one-semester courses meeting 3 hours 
a week. For instance: 

Introduction to ODEs (Chaps. 1-2, Sec. 21.1) 

Laplace Transforms (Chap. 6) 

Matrices and Linear Systems (Chaps. 7-8) 

Vector Algebra and Calculus (Chaps. 9-10) 

Fourier Series and PDEs (Chaps. 11-12, Secs. 21.4—21.7) 

Introduction to Complex Analysis (Chaps. 13-17) 

Numeric Analysis (Chaps. 19, 21) 

Numeric Linear Algebra (Chap. 20) 

Optimization (Chaps. 22-23) 

Graphs and Combinatorial Optimization (Chap. 23) 

Probability and Statistics (Chaps. 24-25) 
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PART A 

Ordinary 
Differential 
Equations (ODEs) 


CHAPTER 1 
CHAPTER 2 
CHAPTER 3 
CHAPTER 4 
CHAPTER 5 
CHAPTER 6 


First-Order ODEs 
Second-Order Linear ODEs 
Higher Order Linear ODEs 

Systems of ODEs. Phase Plane. Qualitative Methods 
Series Solutions of ODEs. Special Functions 
Laplace Transforms 


Differential equations are of basic importance in engineering mathematics because many 
physical laws and relations appear mathematically in the form of a differential equation. 
In Part A we shall consider various physical and geometric problems that lead to 
differential equations, with emphasis on modeling , that is, the transition from the physical 
situation to a “mathematical model.” In this chapter the model will be a differential 
equation, and as we proceed we shall explain the most important standard methods for 
solving such equations. 


Part A concerns ordinary differential equations (ODEs), whose unknown functions 
depend on a single variable. Partial differential equations (PDEs), involving unknown 
functions of several variables, follow in Part C. 


ODEs are very well suited for computers. Numeric methods for ODEs can be studied 
directly after Chaps . 1 or 2. See Secs. 21.1-21.3, which are independent of the other 
sections on numerics. 


1 


CHAPTER 1 

First-Order ODEs 


In this chapter we begin our program of studying ordinary differential equations (ODEs) 
by deriving them from physical or other problems (modeling), solving them by standard 
methods, and interpreting solutions and their graphs in terms of a given problem. Questions 
of existence and uniqueness of solutions will also be discussed (in Sec. 1.7). 

We begin with the simplest ODEs, called ODEs of the first order because they involve 
only the first derivative of the unknown function, no higher derivatives. Our usual 
notation for the unknown function will be y(jc), or y(t) if the independent variable is 
time /. 

If you wish, use your computer algebra system (CAS) for checking solutions, but make 
sure that you gain a conceptual understanding of the basic terms, such as ODE, direction 
field, and initial value problem. 

COMMENT. Numerics for first-order ODEs can be studied immediately after this 
chapter. See Secs. 21.1-21.2, which are independent of other sections on numerics. 

Prerequisite: Integral calculus. 

Sections that may be omitted in a shorter course: 1.6, 1.7. 

References and Answers to Problems: App. 1 Part A, and App. 2 


Basic Concepts. Modeling 

If we want to solve an engineering problem (usually of a physical nature), we first have 
to formulate the problem as a mathematical expression in terms of variables, functions, 
equations, and so forth. Such an expression is known as a mathematical model of die 
given problem. The process of setting up a model, solving it mathematically, and 
interpreting the result in physical or other terms is called mathematical modeling or, briefly, 
modeling. We shall illustrate this process by various examples and problems because 
modeling requires experience. (Your computer may help you in solving but hardly in 
setting up models.) 

Since many physical concepts, such as velocity and acceleration, are derivatives, a 
model is very often an equation containing derivatives of an unknown function. Such 
a model is called a differential equation. Of course, we then want to find a solution 
(a function that satisfies the equation), explore its properties, graph it, find values of it, 
and interpret it in physical terms so that we can understand the behavior of the physical 
system in our given problem. However, before we can turn to methods of solution we 
must first define basic concepts needed throughout this chapter. 


SEC 1.1 Basic Concepts. Modeling 



Falling stone 

y" =g = const. 
(Sec. 1.1) 




Parachutist 

mv' - mg-bv 2 
(Sec. 1.2) 


Water level h 

Outflowing water 

(Sec. 1.3) 


U y 


Displacement y 

Vibrating mass 
on a spring 
my" + ky - 0 
(Secs. 2.4, 2.8) 


/ >\ / h. 


\/ \ 


/ / 

\u y Mi 1 / 


Beats of a vibrating 
system 

y" + (o*y = cos a )t, a) 0 = co 
(Sec. 2.8) 


Current I in an 
RLC circuit 

Ll" + RI' + -i-i = E“ 

(Sec. 2.9) 



Deformation of a beam 

EIy h = fix) 

(Sec. 3.3) 



Pendulum 

Le"+£Sin0 = 0 

(Sec. 4.5) 



- 1 


Lotka-Volterra 
predator-prey model 

y' 2 =h i y 2 - l y 2 
(Sec. 4.5) 


Fig. 1. Some applications of differential equations 
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CHAP. 1 First-Order ODEs 


An ordinary differential equation (ODE) is an equation that contains one or several 
derivatives of an unknown function, which we usually call y(x ) (or sometimes y(/) if the 
independent variable is time t). The equation may also contain y itself, known functions 
of a* (or r), and constants. For example, 

(1) y f = cos a*, 

(2) / + 9y = 0, 

(3) a* 2 /'/ + 2 e x y" = (x 2 + 2 )y 2 


are ordinary differential equations (ODEs). The term ordinary distinguishes them from 
partial differential equations (PDEs), which involve partial derivatives of an unknown 
function of two or more variables. For instance, a PDE with unknown function u of two 
variables x and y is 


d 2 u 

a? 


+ 


c fu 
dy 2 


= 0 . 


PDEs are more complicated than ODEs; they will be considered in Chap. 12. 

An ODE is said to be of order n if the nth derivative of the unknown function y is the 
highest derivative of v in the equation. The concept of order gives a useful classification 
into ODEs of first order, second order, and so on. Thus, (1) is of first order, (2) of second 
order, and (3) of third order. 

In this chapter we shall consider first-order ODEs. Such equations contain only the 
first derivative y and may contain y and any given functions of x. Hence we can write 
them as 


(4) 


Fix. y. /) = 0 


or often in the form 


/ = fix, yf 

This is called the explicit form, in contrast with the implicit form. (4). For instance, the 
implicit ODE x~ z y f — 4y 2 = 0 (where x # 0) can be written explicitly as y f = 4A' 3 y 2 . 


Concept of Solution 

A function 


y = h(x) 

is called a solution of a given ODE (4) on some open interval a < x < b if hix) is defined 
and differentiable throughout the interval and is such that the equation becomes an identity 
if y and y f are replaced with h and //, respectively. The curve (the graph) of h is called 
a solution curve. 

Here, open interval a < x < b means that the endpoints a and b are not regarded as 
points belonging to the interval. Also, a <x<b includes infinite intervals -^<x< b , 
a < x < <&, — 00 < a* < °° (the real line) as special cases. 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


Verification of Solution 

y = h(x) = c/x (c an arbitrary constant, x # 0) is a solution of xy ' = -y. To verify this, differentiate, 
y' = h'(x) = —c/a* 2 , and multiply by x to get xy — —c/x = —y. Thus, xy' = — y, the given ODE. M 

Solution Curves 

The ODE y = dyldx = cos a* can be solved directly by integration on both sides. Indeed, using calculus, we 
obtain y = / cos a dx = sin a + c, where c is an arbitrary constant. This is a family of solutions. Each value 
of c, for instance, 2.75 or 0 or -8, gives one of these curves. Figure 2 shows some of them, for c = -3, -2, 
-1,0, 1,2, 3.4. ■ 



Exponential Growth, Exponential Decay 

From calculus we know that y = ce 3t (c any constant) has the derivative (chain rule!) 


t 

y 


dy_ 

dx 


- 3 ce 3t = 3y. 


This shows that y is a solution of y = 3y. Hence this ODE can model exponential growth, for instance, of 
animal populations or colonies of bacteria. It also applies to humans for small populations in a large country 
(e.g., the United States in early times) and is then known as Malthus’s law. 1 We shall say more about this topic 
in Sec. 1.5. 

Similarly, y = — 0.2y (with a minus on the right!) has the solution y = ce~°' 2t . Hence this ODE models 
exponential decay, for instance, of a radioactive substance (see Example 5). Figure 3 shows solutions for some 
positive c. Can you find what the solutions look like for negative c? M 



Earned after the English pioneer in classic economics, THOMAS ROBERT MALTHUS (1766-1834). 
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EXAMPLE 4 


We see that each ODE in these examples has a solution that contains an arbitrary constant 
c . Such a solution containing an arbitrary constant c is called a general solution of the 
ODE. 

(We shall see that c is sometimes not completely arbitrary but must be restricted to 
some interval to avoid complex expressions in the solution.) 

We shall develop methods that will give general solutions uniquely (perhaps except for 
notation). Hence we shall say the general solution of a given ODE (instead of a general 
solution). 

Geometrically, the general solution of an ODE is a family of infinitely many solution 
curves, one for each value of the constant c. If we choose a specific c (e.g., c = 6.45 or 
0 or -2.01) we obtain what is called a particular solution of the ODE. A particular 
solution does not contain any arbitrary constants. 

In most cases, general solutions exist, and every solution not containing an arbitrary constant 
is obtained as a particular solution by assigning a suitable value to c. Exceptions to these 
rules occur but are of minor interest in applications; see Prob. 16 in Problem Set 1.1. 


Initial Value Problem 

In most cases the unique solution of a given problem, hence a particular solution, is 
obtained from a general solution by an initial condition y(.v 0 ) = y 0 , with given values 
a * 0 and y 0 , that is used to determine a value of the arbitrary constant c. Geometrically 
this condition means that the solution curve should pass through the point (a 0 , y 0 ) in 
the Ay-plane. An ODE together with an initial condition is called an initial value 
problem. Thus, if the ODE is explicit, y' = /(a, y), the initial value problem is of the 
form 

(5) / = /(*, y), y(* 0 ) = y 0 - 


Initial Value Problem 

Solve the initial value problem 


/ «v 

v' = -i- = 3\\ v(0) = 5.7. 

ax 

Solution . The general solution is y(.v) = ce 3x ; see Example 3. From this solution and the initial condition 
we obtain .v(0) = ce° = c = 5.7. Hence the initial value problem has the solution v(.v) = 5.7e 3 ‘ r . This is a 
particular solution. H 


Modeling 

The general importance of modeling to the engineer and physicist was emphasized at the 
beginning of this section. We shall now consider a basic physical problem that will show 
the typical steps of modeling in detail: Step 1 the transition from the physical situation 
(the physical system) to its mathematical formulation (its mathematical model); Step 2 
the solution by a mathematical method; and Step 3 the physical interpretation of the result. 
This may be the easiest way to obtain a first idea of the nature and purpose of differential 
equations and their applications. Realize at the outset that your computer (your CAS) may 
perhaps give you a hand in Step 2, but Steps 1 and 3 are basically your work. And Step 2 
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EXAMPLE 5 


requires a solid knowledge and good understanding of solution methods available to you — 
you have to choose the method for your work by hand or by the computer. Keep this in 
mind, and always check computer results for errors (which may result, for instance, from 
false inputs). 

Radioactivity. Exponential Decay 

Given an amount of a radioactive substance, say, 0.5 g (gram), find the amount present at any later time. 

Physical Informa lion. Experiments show that at each instant a radioactive substance decomposes at a rate 
proportional to the the amount present. 

Step 1. Setting up a mathematical model (a differential equation ) of the physical process. Denote by y(t) the 
amount of substance still present at any time t. By the physical law, the time rate of change /(/) = dyldt is 
proportional to y(/). Denote the constant of proportionality by k. Then 

dy 

(6) i-* 

The value of k is known from experiments for various radioactive substances (e.g., k = -1.4* 10” n sec~\ 
approximately, for radium 88 Ra 226 ). k is negative because y(t) decreases with time. The given initial amount is 
0.5 g. Denote the corresponding time by r = 0. Then the initial condition is y(0) = 0.5. This is the instant at 
which the process begins; this motivates the term initial condition (which, however, is also used more generally 
when the independent variable is not time or when you choose a t other than t = 0). Hence the model of the 
process is the initial value problem 

(7) y, = *0) = 0.5. 

Step 2. Mathematical solution. As in Example 3 we conclude that the ODE (6) models exponential decay and 
has the general solution (with arbitrary constant c but definite given k) 

(8) ,y(0 = ce kt . 

We now use the initial condition to determine c. Since y(0) = c from (8). this gives _y(0) = c = 0.5. Hence the 
particular solution governing this process is 

(9) y(t) = 0.5e kt (Fig. 4). 

Always check your result — it may involve human or computer errors! Verify by differentiation (chain rule!) 
that your solution (9) satisfies (7) as well as y(0) = 0.5: 

= 0.5 ke ki = k ■ 0.5e kt = ky, y(Q) = 0.5e° = 0.5. 
dt 

Step 3 . Interpretation of result. Formula (9) gives the amount of radioactive substance at time t. It starts from 
the correct given initial amount and decreases with time because k (the constant of proportionality, depending 
on the kind of substance) is negative. The limit of y as / — > » is zero. H 



Fig. 4. Radioactivity (Exponential decay, 
y = 0.5 e kt , with k = —1.5 as an example) 
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EXAMPLE 6 A Geometric Application 

Geometric problems may also lead to initial value problems. For instance. Find the curve through the point 
(1. 1) in the Ay-plane having at each of its points the slope —yfx. 

Solution . The slope y should equal —yfx. This gives the ODE y — -yfx. Its general solution is y = cfx 
(see Example 1). This is a family of hyperbolas with the coordinate axes as asymptotes. 

Now, for the curve to pass through (1, 1), we must have y = 1 when x = 1. Hence the initial condition is 
v(l) = 1. From this condition and y = cl x we get y( 1) = c/1 = 1; that is, c = 1. This gives the particular 
solution y = 1/a (drawn somewhat thicker in Fig. 5). M 



Fig. 5. Solutions of y = -y/x (hyperbolas) F 'S* 6 * Particular solutions and singular 

solution in Problem 16 
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T^l CALCULUS 

Solve the ODE by integration. 

1. y' = —sin ttx 2 . y* = e~ 2x 

3 . y' = xe xZ/2 4. y* = cosh 4a* 

5I9] VERIFICATION OF SOLUTION 

State the order of the ODE. Verify that the given function 
is a solution. («, b , c are arbitrary constants.) 

5. y f = 1 -I- y 2 , v = tan (a + c) 

6. y” + 7r 2 y = 0, y = a cos ttx + b sin m* 

7. y w + 2y* + lOy = 0, y = 4e~ x sin 3a 

8. y' + 2 y = 4 (a* + l) 2 , y = 5(?" 2;c + 2 a* 2 + 2a* + 1 

9. y m — cos a, y = —sin a + ^a 2 -I- bx + c 

10—14 [ INITIAL VALUE PROBLEMS 

Verify that y is a solution of the ODE. Determine from y 
the particular solution satisfying the given initial condition. 
Sketch or graph this solution. 

10. y f = 0.5y, y = ce °’ 5x . y(2) = 2 

11. v' = 1 + 4v 2 , y = \ tan (2 a 4- c), y(0) = 0 

12 . y = y - a, y = c(?* r + a + 1, y(0) = 3 

13. y' + 2Ay = 0, y = ce"* 2 , y(l) = \/e 

14- y f = y tan a, y = c sec a, y(0) = ^77 


15. (Existence) (A) Does the ODE y' 2 = — 1 have a (real) 
solution? 

(B) Does the ODE |y'| + |y| = 0 have a general 
solution? 

16. (Singular solution) An ODE may sometimes have an 
additional solution that cannot be obtained from the 
general solution and is then called a singular solution . 
The ODE y' 2 — x y + y = 0 is of the kind. Show by 
differentiation and substitution that it has the general 
solution y = ca — c 2 and the singular solution y — a 2 /4. 
Explain Fig. 6. 

17-22 1 MODELING, APPLICATIONS 

The following problems will give you a first impression of 
modeling. Many more problems on modeling follow 
throughout this chapter. 

17. (Falling body) If we drop a stone, we can assume air 
resistance (“drag”) to be negligible. Experiments show 
that under that assumption the acceleration y n = d 2 yldt 2 
of this motion is constant (equal to the so-called 
acceleration of gravity g = 9.80 m/sec 2 = 32 ft/sec 2 ). 
State this as an ODE for y(r), the distance fallen as a 
function of time t. Solve the ODE to get the familiar 
law of free fall, v = gt 2 f 2. 
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18. (Falling body) If in Prob. 17 the stone starts at / = 0 
from initial position y 0 with initial velocity v = v Q , 
show that the solution is y = gt 2 ll + i> 0 f + 3’o- How 
long does a fall of 100 m take if the body falls from 
rest? A fall of 200 m? (Guess first.) 

19. (Airplane takeoff) If an airplane has a run of 3 km, 
starts with a speed 6 m/sec, moves with constant 
acceleration, and makes the run in l min, with what 
speed does it take off? 

20. (Subsonic flight) The efficiency of the engines of 
subsonic airplanes depends on air pressure and usually 
is maximum near about 36 000 ft. Find the air pressure 
y(x) at this height without calculation. Physical 
information. The rate of change y{ x) is proportional 
to the pressure, and at 18 000 ft the pressure has 
decreased to half its value y 0 at sea level. 

21. (Half-life) The half-life of a radioactive substance is 
the lime in which half of the given amount disappears. 
Hence it measures the rapidity of the decay. What 


is the half-life of radium 88 Ra 226 0 n years) in 
Example 5? 

22. (Interest rates) Show by algebra that the investment y(t) 
from a deposit y 0 after / years at an interest rate r is 

y a (t) = y 0 [\ + r]‘ (Interest compounded annually) 

y d (r) = Ml + (r/365)] 365 ' 

(Interest compounded daily). 

Recall from calculus that 

[1 + (l//z)] n — » e as n — > ^ ; 

hence [1 + {r/n)] nt — > e n ; thus 

y c (r) = y 0 <? rt (Interest compounded continuously). 

What ODE does the last function satisfy? Let the 
initial investment be $1000 and r = 6%. Compute the 
value of the investment after l year and after 5 years 
using each of the three formulas. Is there much 
difference? 


1.2 Geometric Meaning of y' = /(x, y). 

Direction Fields 

A first-order ODE 

(1) y* = /U\ y) 

has a simple geometric interpretation. From calculus you know that the derivative y\x) 
of y(x) is the slope of y(x). Hence a solution curve of (1) that passes through a point 
(a* 0 , y 0 ) must have at that point the slope y'(x 0 ) equal to the value off at that point; that is, 

y\x o) = fix o ? 3 ; o)- 

Read this paragraph again before you go on, and think about it. 

It follows that you can indicate directions of solution curves of (1) by drawing short 
straight-line segments ( lineal elements) in the Ay-plane (as in Fig. 7a) and then fitting 
(approximate) solution curves through the direction field (or slope field) thus obtained. 
This method is important for two reasons. 

1. You need not solve (1). This is essential because many ODEs have complicated 
solution formulas or none at all. 

2. The method shows, in graphical form, the whole family of solutions and their typical 
properties. The accuracy is somewhat limited, but in most cases this does not matter. 

Let us illustrate this method for the ODE 

y = x y. 


( 2 ) 
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Direction Fields by a CAS (Computer Algebra System). A CAS plots lineal elements 
at the points of a square grid, as in Fig. 7a for (2), into which you can fit solution curves. 
Decrease the mesh size of the grid in regions where /(a, y) varies rapidly. 

Direction Fields by Using Isoclines (the Older Method). Graph the curves 
/(a\ y) — k — const , called isoclines (meaning curves of equal inclination ). For (2) these 
are the hyperbolas f(x 9 y) = xy = k = const (and the coordinate axes) in Fig. 7b. By (1), 
these are the curves along which the derivative y f is constant. These are not yet solution 
curves — don’t get confused. Along each isocline draw many parallel line elements of the 
corresponding slope k. This gives the direction field, into which you can now graph 
approximate solution curves. 

We mention that for the ODE (2) in Fig. 7 we would not need the method, because we 
shall see in the next section that ODEs such as (2) can easily be solved exactly. For the 
time being, let us verify by substitution that (2) has the general solution 

yU) = ce^ /z (c arbitrary). 

Indeed, by differentiation (chain rule!) we get y = A'(ce* 2/2 ) = xy. Of course, knowing 
the solution, we now have the advantage of obtaining a feel for the accuracy of the 
method by comparing with the exact solution. The particular solution in Fig. 7 through 
(.v, y) = (1, 2) must satisfy y ( 1 ) = 2. Thus, 2 = ce m , c = 2 I'Ve = 1.213, and the particular 
solution isy(A*) = 

A famous ODE for which we do need direction fields is 


(3) / =0.1(1 - a- 2 ) - -. 

y 

(It is related to the van der Pol equation of electronics, which we shall discuss in Sec. 4.5.) 
The direction field in Fig. 8 shows lineal elements generated by the computer. We have 
also added the isoclines for k — —5, -3, 1 as well as three typical solution curves, one 

that is (almost) a circle and two spirals approaching it from inside and outside. 




Fig. 7. Direction field of y — xy 
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On Numerics 

Direction fields give “all” solutions, but with limited accuracy. If we need accurate numeric 
values of a solution (or of several solutions) for which we have no formula, we can use 
a numeric method. If you want to get an idea of how these methods work, go to Sec. 
21.1 and study the first two pages on the Euler-Cauchy method, which is typical of 
more accurate methods later in that section, notably of the classical Runge-Kutta method. 
It would make little sense to interrupt the present flow of ideas by including such methods 
here; indeed, it would be a duplication of the material in Sec. 21.1. For an excursion to 
that section you need no extra prerequisites; Sec. 1.1 just discussed is sufficient. 


-■aaawssgaaSB: 


1-10 1 DIRECTION FIELDS, SOLUTION CURVES 

Graph a direction field (by a CAS or by hand). In the field 
graph approximate solution curves through the given point 
or points ( x , y) by hand. 

1 . / = - >% ( 0 , 0 ), ( 0 , 1 ) 


2. 4yy ' = -9a*, (2, 2) 

3. y' = 1 + y z , 1) 

4. y’ = y - 2y 2 , (0, 0), (0, 0.25), (0, 0.5), (0, 1) 

5. / = * 2 - I fy, (1, -2) 

6. y' = 1 + sin .y, (-1, 0), (1, -4) 

7. y' = .v 3 + a- 3 , (0, 1) 

8 . / = 2xy + 1 , (- 1 , 2 ), ( 0 , 0 ), ( 1 , - 2 ) 

9. / = .y tanh* - 2, (-1, -2), (1, 0), (1, 2) 

10. y' = e v ' x , (1, 1), (2, 2), (3, 3) 
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ACCURACY 


Direction fields are very useful because you can see 
solutions (as many as you want) without solving the ODE, 
which may be difficult or impossible in terms of a formula. 
To get a feel for the accuracy of the method, graph a field, 
sketch solution curves in it, and compare them with the 
exact solutions. 

11. y f = sin \ttx 12. y f = 1/a 2 


13. / = -2y (Sol. y = ce~ 2x ) 


14. y = 3yfx (Sol. >• = ca 3 ) 

15. y' = -In a 


16-18 


MOTIONS 


A body moves on a straight line, with velocity as given, 
and y(/) is its distance from a fixed point 0 and t tune. Find 
a model of the motion (an ODE). Graph a direction field. 
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In il sketch a solution curve corresponding to the given 

initial condition. 

16. Velocity equal to the reciprocal of the distance, _y( 1 ) = 1 

17. Product of velocity and distance equal to y( 3) = -3 

18. Velocity plus distance equal to the square of tune, 

.v(0) = 6 

19. (Skydiver) Two forces act on a parachutist, the 
attraction by the earth mg (m = mass of person plus 
equipment, g = 9.8 m/sec 2 the acceleration of gravity) 
and the air resistance, assumed to be proportional to 
the square of the velocity v(t). Using Newton’s second 
law of motion (mass X acceleration = resultant of the 
forces), set up a model (an ODE for t ;(/)). Graph a 
direction field (choosing m and the constant of 
proportionality equal to 1 ). Assume that the parachute 
opens when v = 10 m/sec. Graph the corresponding 
solution in the field. What is the limiting velocity? 


20. CAS PROJECT. Direction Fields. Discuss direction 
fields as follows. 

(a) Graph a direction field for the ODE y = 1 - y 
and in it the solution satisfying y(0) = 5 showing 
exponential approach. Can you see the limit of any 
solution directly from the ODE? For what initial 
condition will the solution be increasing? Constant? 
Decreasing? 

(b) What do the solution curves of / = — rVy 3 look 
like, as concluded from a direction field. How do they 
seem to differ from circles? What are the isoclines? 
What happens to those curves when you drop the min us 
on the right? Do they look similar to familiar curves? 
First, guess. 

(c) Compare, as best as you can, the old and the 
computer methods, their advantages and disadvantages. 
Write a short report. 


1.! Separable ODEs. Modeling 

Many practically useful ODEs can be reduced to the form 

(i) g(y)y' = m 

by purely algebraic manipulations. Then we can integrate on both sides with respect to x, 
obtaining 


( 2 ) 


fg(y) y' dx = fm 


dx + c. 


On the left we can switch to y as the variable of integration. By calculus, y’ dx = dy, so 
that 


( 3 ) 


fg(y) dy = ff(x) 


dx 4- c. 


If / and g are continuous functions, the integrals in (3) exist, and by evaluating them we 
obtain a general solution of (1). This method of solving ODEs is called the method of 
separating variables, and (1) is called a separable equation, because in (3) the variables 
are now separated: x appears only on the right and y only on the left. 

EXAMPLE 1 A Separable ODE 

The ODE y* — I -4- y 2 is separable because it can be written 
dy 

j _j_ 2 ~~ dr. By integration, arctan y = a* + c or y = tan (a* + c). 
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EXAMPLE 2 


EXAMPLE 3 


It is very important to introduce the constant of integration immediately when the integration is performed. 
If we wrote arctan y = jc, then y = tan x, and then introduced c, we would have obtained v = tan .v + c, which 
is not a solution (when c ^ 0). Verify this. M 


Modeling 

The importance of modeling was emphasized in Sec. 1.1, and separable equations yield 
various useful models. Let us discuss this in terms of some typical examples. 

Radiocarbon Dating 2 

In September 1991 the famous Iceman (Oetzi), a mummy from the Neolithic period of the Stone Age found in 
the ice of the Oeutal Alps (hence the name “Oetzi”) in Southern Tyrolia near the Austrian-Italian border, caused 
a scientific sensation. When did Oetzi approximately live and die if the ratio of carbon 6 C 14 to carbon 6 C 12 in 
this mummy is 52.5% of that of a living organism? 

Physical Information . In the atmosphere and in living organisms, the ratio of radioactive carbon 6 C 14 (made 
radioactive by cosmic rays) to ordinary carbon 6 C 12 is constant. When an organism dies, its absorption of 6 C 14 
by breathing and eating terminates. Hence one can estimate the age of a fossil by comparing the radioactive carbon 
ratio in the fossil with that in the atmosphere. To do this, one needs to know the half-life of 6 C 14 which is 5715 
years ( CRC Handbook of Chemistry and Physics, 83rd ed.. Boca Raton: CRC Press. 2002. page 1 1-52, line 9). 

Solution . Modeling . Radioactive decay is governed by the ODE y' = ky (see Sec. 1.1. Example 5). By 
separation and integration (where r is time and >’o is the initial ratio of 6 C 14 to 6 C 12 ) 

= k dt. In |y| = kt 4- c, y = y 0 e fc£ . 


Next we use the half-life H = 5715 to determine k. When ; 
Thus, 

y 0 e kT1 = 0.5y o . e kH = 0.5. k = 


= H, half of the original substance is still present. 


In 0.5 
H 


0.693 

5715 


= -0.0001213. 


Finally, we use the ratio 52.5% for determining the time t when Oetzi 

In 0.525 


e kt _ ^—0.00012131 = q 525 


-0.0001213 


= 5312. 


died (actually, was killed). 

Answer: About 5300 years ago. 


Other methods show that radiocarbon dating values are usually too small. According to recent research, this is 
due to a variation in that carbon ratio because of industrial pollution and other factors, such as nuclear testing. M 

Mixing Problem 

Mixing problems occur quite frequently in chemical industry. We explain here how to solve the basic model 
involving a single tank. The tank in Fig. 9 contains 1000 gal of water in which initially 100 lb of salt is dissolved. 
Brine runs in at a rate of 10 gal/min, and each gallon contains 5 lb of dissoved salt. The mixture in the tank is 
kept uniform by stirring. Brine runs out at 10 gal/min. Find the amount of salt in the tank at any time r. 

Solution . Step 1. Setting up a model Let y(t) denote the amount of salt in the tank at time /. Its time rate 
of change is 

y = Salt inflow rate - Salt outflow rate “Balance law” 

5 lb times 10 gal gives an inflow of 50 lb of salt. Now. the outflow is 10 gal of brine. This is 10/1000 = 0.01 
(= 1%) of the total brine content in the tank, hence 0.01 of the salt content v(r), that is, 0.0 ly(/). Thus the model 
is the ODE 

(4) / = 50 - O.Oly = -0.01 0- - 5000). 


2 Method by WILLARD FRANK LIBBY (1908-1980), American chemist, who was awarded for this work 
the 1960 Nobel Prize in chemistry. 
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Step 2. Solution of the model The ODE (4) is separable. Separation, integration, and taking exponents on both 
sides gives 


— - = -0.01 dt. In \y - 5000| = -0.01/ + c*. y - 5000 = ce~ omt . 

y — 5000 

Initially the tank contains 100 lb of salt. Hence y(0) = 100 is the initial condition that will give the unique 
solution. Substituting y = 100 and / = 0 in the last equation gives 100 — 5000 = ce° = c. Hence c = —4900. 
Hence the amount of salt in the tank at time / is 

(5) ,v(0 = 5000 - 4900e _oou . 

This function shows an exponential approach to the limit 5000 lb; see Fig. 9. Can you explain physically that 
y(/) should increase with time? That its limit is 5000 lb? Can you see the limit directly from the ODE? 

The model discussed becomes more realistic in problems on pollutants in lakes (sec Problem Set 1.5. Prob. 
27) or drugs in organs. These types of problems are more difficult because the mixing may be imperfect and 
the flow rates (in and out) may be different and known only very roughly. ■ 




Tank 



Salt content y{t) 


Fig. 9. Mixing problem in Example 3 


EXAMPLE 4 Heating an Office Building (Newton’s Law of Cooling 3 ) 

Suppose that in Winter the daytime temperature in a certain office building is maintained at 70°F. The heating 
is shut off at 10 p.m. and turned on again at 6 a.m. On a certain day the temperature inside the building at 
2 a.M. was found to be 65°F. The outside temperature was 50°F at 10 p.m. and had dropped to 40°F by 6 A.M. 
What was the temperature inside the building when the heat was turned on at 6 a.m.? 

Physical information. Experiments show that the time rate of change of the temperature T of a body B (which 
conducts heat well, as, for example, a copper ball does) is proportional to the difference between T and the 
temperature of the surrounding medium (Newton’s law of cooling). 

Solution . Step 1. Setting up a model Let T(t) be the temperature inside the building and T A the outside 
temperature (assumed to be constant in Newton’s law). Then by Newton’s law. 

dT 

(6) ~jf =k(T ~ Ta) ' 

Such experimental laws are derived under idealized assumptions that rarely hold exactly. However, even if a 
model seems to fit the reality only poorly (as in the present case), it may still give valuable qualitative information. 
To see how good a model is, the engineer will collect experimental data and compare them with calculations 
from the model. 


3 Sir ISAAC NEWTON (1642-1727), great English physicist and mathematician, became a professor at 
Cambridge in 1669 and Master of the Mint in J699. He and the German mathematician and philosopher 
GOTTFRIED WILHELM LEIBNIZ ( 1 646-1 7 1 6) invented (independently) the differential and integral calculus. 
Newton discovered many basic physical laws and created the method of investigating physical problems by 
means of calculus. His Philosophiae naturalis principia mathematics {Mathematical Principles of Natural 
Philosophy , 1687) contains the development of classical mechanics. His work is of greatest importance to both 
mathematics and physics. 
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EXAMPLE 5 


Step 2. General solution . We cannot solve (6) because we do not know T A , just that it varied between 50°F 
and 40°F, so we follow the Golden Rule: If you cannot solve your problem, try to solve a simpler one. We 
solve (6) with die unknown function T A replaced with the average of the two known values, or 45°F. For physical 
reasons we may expect that this will give us a reasonable approximate value of T in the building at 6 a.m. 

For constant T A = 45 (or any other constant value) the ODE (6) is separable. Separation, integration, and 
taking exponents gives the general solution 

dT * 

r _ = k dt. In | T - 45| = kt + c* T(t) = 45 + ce kt (c = e c ). 

Step 3. Particular solution . We choose 10 p.m. to be / = 0. Then the given initial condition is T(0) = 70 and 
yields a particular solution, call it T p . By substitution, 

T(0) = 45 + ce° = 70, e = 70 - 45 = 25, T p (t) = 45 + 25e' u . 

Step 4. Determination of k. We use 7(4) = 65, where t = 4 is 2 a.m. Solving algebraically for k and inserting 
k into T p (t) gives (Fig. 10) 

T p ( 4) = 45 + 25e 4k = 65, e 4k = 0.8, k = $ In 0.8 = -0.056, T p (t) = 45 + 25e _0 0S6t . 
Step 5. Answer and interpretation. 6 a.m. is t = 8 (namely, S hours after 10 p.m.), and 

T p { 8) = 45 + 25<T 0056 ' 8 = 61 [°F]. 

Hence the temperature in the building dropped 9°F, a result that looks reasonable. M 



Fig. 10. Particular solution (temperature) in Example 4 

Leaking Tank. Outflow of Water Through a Hole (Torricelli's Law) 

This is another prototype engineering problem that leads to an ODE. It concerns the outflow of water from a 
cylindrical tank with a hole at the bottom (Fig. 1 1). You are asked to find the height of the water in the tank at 
any time if the tank has diameter 2 m, the hole has diameter 1 cm, and the initial height of the water when the 
hole is opened is 2.25 m. When will the tank be empty? 

Physical information. Under the influence of gravity the outflowing water has velocity 

(7) v(t) = 0.600 V2g/i(/) (Torricelli’s law 4 ), 

where h(t) is the height of the water above the hole at time t, and g = 980 cm/sec z = 32.17 ft/sec 2 is the 
acceleration of gravity at the surface of the earth. 

Solution . Step L Setting up the model. To get an equation, we relate the decrease in water level //(/) to the 
outflow. The volume A V of the outflow during a short time At is 

AV= Av At (A = Area of hole). 


4 EVANGELISTA TORRICELLI (1608-1647), Italian physicist, pupil and successor of GALILEO GALILEI 
(1564-1642) at Florence. The “contraction factor’ 0.600 was introduced by J. C. BORDA in 1766 because the 
stream has a smaller cross section than the area of the hole. 
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Ay must equal the change AV* of the volume of the water in the tank. Now 

AV^ = —B Ah {B = Cross-sectional area of tank) 

where A/i (> 0) is the decrease of the height MO of the water. The minus sign appears because the volume of 
the water in the tank decreases. Equating AV and AV* gives 

—B A/i = Av At. 

We now express v according to Torricelli’s law and then let At (the length of the time interval considered) 
approach 0 — this is a standard way of obtaining an ODE as a model. That is, we have 

Ah A A 

A T = ~B V = ~B 

and by letting Ar — » 0 we obtain the ODE 

dh A /— 

— = -26.56 - Vh. 
dt B 

where 26.56 = 0.600 V2 • 980. This is our model, a first-order ODE. 

Step 2. General solution . Our ODE is separable. AIB is constant. Separation and integration gives 

dh A j- A 

—r = -26.56 — dr and 2 Vh = c* - 26.56 - /. 

Vh B B 

Dividing by 2 and squaring gives h = (c - 13.2 Mt/B)*. Inserting 13.28/1 IB = 13.28 • 0.5 2 7t/I00 2 7t = 0.000332 
yields the general solution 

h(t) = (c - 0.000332/) 2 . 

Step 3. Particular solution. The initial height (the initial condition) is M0) = 225 cm. Substitution of t = 0 
and h = 225 gives from the general solution c 2 = 225, c = 15.00 and thus the particular solution (Fig. 1 1) 

h v {t) = (15.00 - 0.000332r) 2 . 

Step 4. Tank empty . h p (t) = 0 if t = 15.00/0.000332 = 45 181 [sec] = 12.6 [hours]. 

Here you see distinctly the importance of the choice of units — we have been working with the Cgs system, 
in which time is measured in seconds! We used g = 980 cm/sec 2 . 

Step 5. Checking. Check the result I 




Water level hit) in tank 


Fig. 11. Example 5. Outflow from a cylindrical tank ("leaking tank"). Torricelli's law 


Extended Method: Reduction to Separable Form 

Certain nonseparable ODEs can be made separable by transformations that Introduce for 
y a new unknown function. We discuss this technique for a class of ODEs of practical 
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EXAMPLE 6 


importance, namely, for equations 


( 8 ) 



Here, f is any (differentiable) function of y/x, such as sin {y/x) , O'/*) 4 , and so on. (Such 
an ODE is sometimes called a homogeneous ODE , a term we shall not use but reserve 
for a more important purpose in Sec. 1.5.) 

The form of such an ODE suggests that we set y/x = u\ thus, 

(9) y = ux and by product differentiation y = ux + u. 


Substitution into y' = f(y/x) then gives ux + u = f(u) or ux = f(u) — u. We see that 
this can be separated: 


( 10 ) 


du dx 

f(u) - u x 


Reduction to Separable Form 

Solve 


2xyy' = y z 


X 


2 


Solution . To get the usual explicit form, divide the given equation by 2vy, 


? = 


2 2 
y - x 


2x 


x 


2 xy lx 2y 

Now substitute y and y from (9) and then simplify by subtracting u on both sides. 


+ — — — , 
2 2 u 


u x - — — - — 
2 2 u 


2 u 


You see that in the last equation you can now separate the variables. 


2 u du _ dx 
l + « 2 


By integration. 


ln(l + w 2 ) = -In \x\ 4- c* = In 


x 


+ c*. 


Take exponents on both sides to get 1 + u 2 = dx or 1 + {y!x) 2 = dx. Multiply the last equation by x 2 to 
obtain (Fig. 12) ^ 

x 2 + y 2 = cx. Thus fx - -|) + y 2 = -j- . 


This general solution represents a family of circles passing through the origin with centers on the .r-axis. M 



Fig. 12. General solution (family of circles) in Example 6 
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1. (Constant of integration) An arbitrary constant of 
integration must be introduced immediately when the 
integration is performed. Why is this important? Give 
an example of your own. 


2-9 


GENERAL SOLUTION 


Find a general solution. Show the steps of derivation. Check 
your answer by substitution. 

2 . / + (* + 2)y 2 = 0 

3. y' = 2 sec 2 y 

4. y' = (y + 9xf (>• + 9.v = v ) 


5. yy' -I- 36.v = 0 

6. y' = (4a- 2 + y 2 )/(.v_v) 

7. y* sin tta = y cos 7TA 

8. xy* - §y 2 + y 

9. y' e 7rx — y 2 4* 1 


10-19 


INITIAL VALUE PROBLEMS 


Find the particular solution. Show the steps of derivation, 
beginning with the general solution. (L, R, b are constants.) 

10. yy* + 4a* = 0, y(0) = 3 


11. drfdt = —2/r, r(0) = r 0 

12. 2A*yy' = 3y 2 + a* 2 , y( 1) = 2 


13. L dlldt + R1 = 0, /(0) = / 0 

14. y' = y/A + (2a 3 /v) cos(a 2 ), y(Vn72) = Vn 

15. e 2 *y' = 2(a + 2)y 3 , y(0) = 1 fVE ~ 0.45 

16. Ay* = y 4- 4 a 5 cos 2 (y/A), y(2) = 0 

17. y'A In a = y, y(3) = In 81 

18. drldO = b[(drfd0) cos 0 4- r sin 0], r(|7r) = ^ 
0 < b < 1 

19. yy' = (a — l)e~ 3 ' 2 , v(0) = 1 


20. (Particular solution) Introduce limits of integration in 
(3) such that y obtained from (3) satisfies the initial 
condition y(A 0 ) = y 0 . Try the formula out on Prob. 1 9. 


21-36 


APPLICATIONS, MODELING 


21. (Curves) Find all curves in the Ay-plane whose 
tangents all pass through a given point (a, b). 

22. (Curves) Show that any (nonvertical) straight line 
through the origin of the Ay-plane intersects all solution 
curves of y* = g(y/x) at the same angle. 


23. (Exponential growth) If the growth rate of the amount 
of yeast at any time t is proportional to the amount 
present at that time and doubles in 1 week, how much 
yeast can be expected after 2 weeks? After 4 weeks? 

24. (Population model) If in a population of bacteria the 
birth rate and death rate are proportional to the number 


of individuals present, what is the population as a 
function of time? Figure out the limiting situation for 
increasing time and interpret it. 

25. (Radiocarbon dating) If a fossilized tree is claimed to 
be 4000 years old, what should be its 6 C 14 content 
expressed as a percent of the ratio of 6 C 14 to 6 C 12 in a 
living organism? 

26. (Gompertz growth in tumors) The Gompertz model 
is y' = —Ay In y ( A > 0), where y(/) is the mass of 
tumor cells at time t. The model agrees well with 
clinical observations. The declining growth rate with 
increasing y > 1 corresponds to the fact that cells in 
the interior of a tumor may die because of insufficient 
oxygen and nutrients. Use the ODE to discuss the 
growth and decline of solutions (tumors) and to find 
constant solutions. Then solve the ODE. 

27. (Dryer) If wet laundry loses half of its moisture 
during the first 5 minutes of drying in a dryer and if 
the rate of loss of moisture is proportional to the 
moisture content, when will the laundry be practically 
dry, say, when will it have lost 95% of its moisture? 
First guess. 

28. (Alibi?) Jack, arrested when leaving a bar, claims that 
he has been inside for at least half an hour (which 
would provide him with an alibi). The police check the 
water temperature of his car (parked near the entrance 
of the bar) at the instant of arrest and again 30 minutes 
later, obtaining the values 190°F and 110°F, 
respectively. Do these results give Jack an alibi? (Solve 
by inspection.) 

29. (Law of cooling) A thermometer, reading 10°C, is 
brought into a room whose temperature is 23°C. Two 
minutes later the thermometer reading is 1 8°C. How 
long will it take until the reading is practically 23°C, 
say, 22,8°C? First guess. 

30. (Torricelli’s law) How does the answer in Example 5 
(the time when the tank is empty) change if the 
diameter of the hole is doubled? First guess. 

31. (Torricelli’s l aw) Sh ow that (7) looks reasonable 
inasmuch as V2 gh(t) is the speed a body gains if it 
falls a distance h (and air resistance is neglected). 

32. (Rope) To tie a boat in a harbor, how many times must 
a rope be wound around a bollard (a vertical rough 
cylindrical post fixed on the ground) so that a man 
holding one end of the rope can resist a force exerted 
by the boat one thousand times greater than the man 
can exert? First guess. Experiments show that the 
change A S of the force 5 in a small portion of the rope 
is proportional to S and to the small angle A (f> in Fig. 

13. Take the proportionality constant 0.15. 
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Small 



33. (Mixing) A tank contains 800 gal of water in which 
200 lb of salt is dissolved. Two gallons of fresh water 
runs in per minute, and 2 gal of the mixture in the tank, 
kept uniform by stirring, runs out per minute. How 
much salt is left in the tank after 5 hours? 

34. WRITING PROJECT. Exponential Increase, Decay, 
Approach. Collect, order, and present all the information 
on the ODE y = ky and its applications from the text 
and the problems. Add examples of your own. 

35. CAS EXPERIMENT. Graphing Solutions. A CAS 
can usually graph solutions even if they are given by 
integrals that cannot be evaluated by the usual methods 
of calculus. Show this as follows. 


(A) Graph the curves for the seven initial value 
problems y = e~ x2f2 , y(0) = 0, ± 1 , ±2, ±3, common 
axes. Are these curves congruent? Why? 

(B) Experiment with approximate curves of nth partial 
sums of the Maclaurin series obtained by termwise 
integration of that of y in (A); graph them and describe 
qualitatively the accuracy for a fixed interval 
0 = x = b and increasing w, and then for fixed n and 
increasing b. 

(C) Experiment with y ' = cos (a* 2 ) as in (B). 

(D) Find an initial value problem with solution 

V = e? I e~* 2 dt and experiment with it as in (B). 

J o 

36. TEAM PROJECT. Torricelli’s Law. Suppose that 
the tank in Example 5 is hemispherical, of radius /?, 
initially full of water, and has an outlet of 5 cm 2 cross- 
sectional area at the bottom. (Make a sketch.) Set up 
the model for outflow. Indicate what portion of your 
work in Example 5 you can use (so that it can become 
part of the general method independent of the shape of 
the tank). Find the time t to empty the tank (a) for any 
R , (b) for R = 1 m. Plot t as function of R. Find the 
time when h = R/2 (a) for any /?, (b) for R = 1 m. 


1.4 Exact ODEs. Integrating Factors 

We remember from calculus that if a function u(x , y ) has continuous partial derivatives, 
its differential (also called its total differential) is 

du du 

du = — dx H dy. 

dx dy 

From this it follows that if u(x, y) = c = const, then du = 0. 

For example, if u = x 4- * 2 y 3 = c, then 

du = (1 + 2xy z ) dx + 3,v 2 y 2 dy — 0 
or 

t dy 1 4- 2 xy 3 
y = ~dx = 2xY 9 

an ODE that we can solve by going backward. This idea leads to a powerful solution 
method as follows. 

A first-order ODE M(x, y) + N(x, y)y' = 0, written as (use dy = y' dx as in Sec. 1.3) 


( 1 ) 


M(x, y ) dx + N(x, y) dy = 0 
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is called an exact differential equation if the differential form Mix, y)dx + N(x, y) dy 
is exact, that is, this form is the differential 

du du 

(2) du = — dx + — dy 

dx dy 

of some function u(x , y). Then (1) can be written 


du = 0. 


By integration we immediately obtain the general solution of (1) in the form 
(3) u(x t y) = c. 


This is called an implicit solution, in contrast with a solution y = h(x) as defined in Sec. 
1.1, which is also called an explicit solution, for distinction. Sometimes an implicit solution 
can be converted to explicit form. (Do this for x 2 + y 2 = 1.) If this is not possible, your 
CAS may graph a figure of the contour lines (3) of the function a(x, y) and help you in 
understanding the solution. 

Comparing (1) and (2), we see that (1) is an exact differential equation if there is some 
function u{x , y) such that 


(4) 


du 

(a) — = Af, 

dx 


du 

(b) —=N. 

dy 


From this we can derive a formula for checking whether (1) is exact or not, as follows. 

Let M and N be continuous and have continuous first partial derivatives in a region in 
the xy-plane whose boundary is a closed curve without self-intersections. Then by partial 
differentiation of (4) (see App. 3.2 for notation), 


dM _ 

d 2 u 

dy 

dy dx 

dN _ 

d 2 u 

dx 

dx dy 

s two second pai 

dM 

dN 

dy 

dx 


By the assumption of continuity the two second partial derivatives are equal. Thus 


(5) 


This condition is not only necessary but also sufficient for (1) to be an exact differential 
equation. (We shall prove this in Sec. 10.2 in another context. Some calculus books (e.g., 
Ref. [GR1 1] also contain a proof.) 

If (1) is exact, the function u(x , y) can be found by inspection or in the following 
systematic way. From (4a) we have by integration with respect to x 
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EXAMPLE 1 


( 6 ) 


u = Jm dx 4- k{y)\ 


in this integration, y is to be regarded as a constant, and k(y) plays the role of a “constant” 
of integration. To determine k(y), we derive du/dy from (6), use (4b) to get dk/dy, and 
integrate dkJdy to get k. 

Formula (6) was obtained from (4a). Instead of (4a) we may equally well use (4b). 
Then instead of (6) we first have by integration with respect to y 

(6*) u=jNdy + l(x). 


To determine Z( x), we derive du/d : c from (6*), use (4a) to get dl/dx, and integrate. We 
illustrate all this by the following typical examples. 

An Exact ODE 

Solve 

(7) cos (a 1 4- y) dx + (3 y 2 4- 2y 4- cos (x 4- y)) dy = 0. 


Solution . Step 1. Test for exactness. Our equation is of the form (1) with 

M = cos (x 4- y), 

N — 3y 2 4- 2y 4- cos (x 4- y). 


Thus 


dM . , 

— = -sin (x 4- y), 

oy 


dN_ 

dx 


-sin (x 4- y). 


From this and (5) we see that (7) is exact. 

Step 2. Implicit general solution. From (6) we obtain by integration 

(8) u = J M dx 4- k(y) = j cos (x 4- y) dx 4- k(y) = sin (x 4- y) 4- k(y). 


To find k(y ), we differentiate this formula with respect to y and use formula (4b), obtaining 

du dk „ 2 # t , 

— = cos (x 4- y) 4- — = N = 3y z 4- 2y 4- cos (x 4- y). 

ay dy 

Hence dk/dy = 3y 2 4- 2y. By integration, k = y 3 4- y 2 + c*. Inserting this result into (8) and observing (3), 
we obtain the answer 


u(x, y) = sin (x 4- y) 4- y 3 4* y 2 = c. 


Step 3 . Checking an implicit solution . We can check by differentiating the implicit solution w(x, y) = c implicitly 
and see whether this leads to the given ODE (7): 

On On „ 

(9) du = — dx+ —dy = cos (x + y)dx + (cos (x 4- y) 4- 3y 4- 2y) dy = 0. 


This completes the check. 
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EXAMPLE 2 


EXAMPLE 3 


An Initial Value Problem 

Solve the initial value problem 

(10) (cosy sinh a* + l ) rfx — siny cosh a* dy = 0, y(l) = 2. 

Solution . You may verify that the given ODE is exact. We find //. For a change, let us use (6*), 
u = — J sin y cosh x dy + /(.v) = cos y cosh a* + /(a*). 

From this, du/dx ~ cosy sinh a* + dUdx = M — cosy sinh a* + 1. Hence dlfdx = 1. By integration, 
Kx) = x + c*. This gives the general solution m(a\ y) = cosy cosh a* + x = c. From the initial condition, 
cos 2 cosh I + I = 0.358 = c. Hence the answer is cos y cosh x 4- x = 0.358. Figure 14 shows the particular 
solutions for c = 0. 0.358 (thicker curve), 1, 2. 3. Check that the answer satisfies the ODE. (Proceed as in 
Example 1.) Also check that the initial condition is satisfied. ■ 



Fig. 14. Particular solu.ions in Example 2 


WARNING! Breakdown in the Case of Nonexactness 

The equation — y dx + jr dy - 0 is not exact because M = — y and N - a\ so that in (5), dMfdy = - 1 but 
dN/dx = 1 . Let us show that in such a case the present method does not work. From (6), 

r du dk 

u — \ M dx + k( y) = -.\y + k(y), hence — = —x -f — . 

J * dy ay 

Now, dttftiy should equal N = a*, by (4b). However, this is impossible because k(y) can depend only on y. Try 
(6*): it will also fail. Solve the equation by another method that we have discussed. I 


Reduction to Exact Form. Integrating Factors 

The ODE in Example 3 is —y dx + x dy = 0. It is not exact. However, if we multiply it 
by 1/a: 2 , we get an exact equation [check exactness by (5)!], 


( 11 ) 


—y dx -f x dy 


ly y 1 ( y\ 


Integration of (1 i) then gives the general solution y/x = c = const. 
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EXAMPLE 4 


This example gives the idea. All we did was multiply a given nonexact equation, say, 
(12) P( x, y) dx -1- Q(x , y) dy = 0, 


by a function F that, in general, will be a function of both x and y . The result was an equation 
(13) FPdx + FQdy = 0 

that is exact, so we can solve it as just discussed. Such a function F(jc, y) is then called 
an integrating factor of (12). 


Integrating Factor 

The integrating factor in (11) is F = i/.v 2 . Hence in this case the exact equation (13) is 

— y dx -f .v dy / y\ y 

FPdx + FQdy = — j = d I - I = 0. Solution - = c. 


These are straight lines v = cx through the origin. 

It is remarkable that we can readily find other integrating factors for the equation — y dx + x dy = 0. namely. 
1/y 2 , l/(jty), and l/(x 2 + y 2 ). because 


(14) 


— y dx 4- jc dy 



—y dx + x dy 
xy 



-y dx + .v dy ( y \ — 

« « — = d I arctan — . ■ 

-v 2 + y 2 \ x ) 


How to Find Integrating Factors 

In simpler cases we may find integrating factors by inspection or perhaps after some trials, 
keeping (14) in mind. In the general case, the idea is the following. 

For M dx + N dy = 0 the exactness condition (4) is dM/dy = dN/dx . Hence for (13), 
FP dx + FQ dy = 0, the exactness condition is 

S cl 

(15) — (FP) = — (FQ). 

dy ox 

By the product rule, with subscripts denoting partial derivatives, this gives 


F y P 4- FP y = F X Q + FQ X . 

In the general case, this would be complicated and useless. So we follow die Golden Rule: 
If you cannot solve your problem, try to solve a simpler one — the result may be useful 
(and may also help you later on). Hence we look for an integrating factor depending only 
on one variable; fortunately, in many practical cases, there are such factors, as we shall 
see. Thus, let F = F(x). Then F y = 0, and F x = F f = dF/dx, so that (15) becomes 

FP y = F'Q + FQ X . 

Dividing by FQ and reshuffling terms, we have 


(16) 


1 dF 

t: =*■ 

F dx 


where 




dQ 

dx 


)■ 


This proves the following theorem. 
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THEOREM 1 


THEOREM 2 


EXAMPLE 5 


Integrating Factor F(x) 

(12) is such that the right side R of (16), depends only on x, then (12) has an 
integrating factor F = F(x), which is obtained by integrating (16) and taking 
exponents on both sides, 

(17) F(x) = exp f R(x) dx. 


Similarly, if F* = F*(y), then instead of (16) we get 

1 dF* 1 / dQ 

(IS) — — - = R*, where /?=*==- -p- 

F*dy P \dx 


dP_ 

dy 


) 


and we have the companion 


Integrating Factor F*{y] 

If ( 12) is such that the right side R* of (IS) depends only on y, then (12) has an 
integrating factor F* = F*(y), which is obtained from (18) in the form 

(19) F*(y) = exp J R*(y) dy. 


Application of Theorems 1 and 2. Initial Value Problem 

Using Theorem 1 or 2, Find an integrating factor and solve the initial value problem 

(20) {e x+v + ye v ) dx + (xe y - 1) dy = 0, >’(0) = - 1 

Solution. Step 1. Nonexactness. The exactness check fails: 

— = — (e x+y + ve») = e x+,J + e v + ye v but — = 4~ (*«" - 1) = d>. 
dy dy ' dx dx 


Step 2. Integrating factor. General solution . Theorem 1 fails because R [the right side of (16)] depends on 
both x and y, 

r= 75 = ~lf~T ( e * +v + * >>+ ye v - *">■ 

Q \ dy dx J xe y - 1 
Try Theorem 2. The right side of (18) is 

R* = ^ = ~—T Z («* - e* +v ~ e a - ve v ) = -1. 

P \ dx dy ) e x+v + ye v 

Hence (19) gives the integrating factor F*(y) = e~ y . From this result and (20) you get the exact equation 

(e x + y) dx + (,v - e~ v ) dy = 0. 

Test for exactness; you will get 1 on both sides of the exactness condition. By integration, using (4a), 

u = J ( e * + y) dx = e x + xy + k(y). 
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Differentiate this with respect to y and use (4b) to get 


bu 

dy 


dk 

= jc + ~r~ — N — x ~ e y y 
dy 


^ k = e- y + c*. 

dy 


Hence the general solution is 


u(.v, y) = e x 4 xy 4 e~ v - c. 

Step 3. Particular solution . The initial condition y(0) = 1 gives u( 0, — l)=l + 0 + e = 3.72. Hence the 
answer is e x 4 xy + e~ y = l 4 e = 3.72. Figure 15 shows several particular solutions obtained as level curves 
of m(a, y) = t\ obtained by a CAS, a convenient way in cases in which it is impossible or difficult to cast a 
solution into explicit form. Note the curve that (nearly) satisfies the initial condition. 

Step 4. Checking . Check by substitution that the answer satisfies the given equation as well as the initial 
condition. I 



Fig. 15. Particular solutions in Example 5 



1-20 


EXACT ODEs. INTEGRATING FACTORS 


Test for exactness. If exact, solve. If not, use an integrating 
factor as given or find it by inspection or from the theorems 
in the text. Also, if an initial condition is given, determine 
the corresponding particular solution. 

1. a 3 dx 4 y 3 dy = 0 2, (a — y)(*/A — dy) = 0 


3* — sin 7rx sinh y dx 4 cos irx cosh y dy = 0 

4. ( e y - ye x ) dx 4 (xe v — e x ) dy = 0 


5. 9x dx + 4 y dy = 0 


6. e x (cos y dx - sin y dy) — 0 

7. dr - Ire “ 2 * d$ = 0 


8 . (2a + Hy — y/x 2 ) dx + (2 y + 1/a — x/y 2 ) dy = 0 

9. (—y/x 2 + 2 cos 2x) ^ + (1/x - 2 sin 2 y) dy = 0 
10. -2a y sin (a 2 ) dx + cos (a 2 ) dy = 0 


11. — y dx + x dy — 0 

12. (e x+y — y) dx + (xe x+y + 1) dy = 0 

13. -3 y dx + 2a dy = 0, F( a, y) = y/x 4 

14. (a 4 4- y 2 ) dx - xy dy = 0, y(2) - 1 

15. e 2x (2 cos y dx — sin y dy) = 0, y(0) = 0 

16. —sin Ay (y dx + x dy) = 0, y(l) = tt 

17. (cos o)a + o> sin wx) dx -F e x dy = 0, y(0) = 1 

18. (cos Ay 4* x/y) dx + ( 1 4* (x/y) cos Ay) dy = 0 

19. dx 4- e- x (-e- y 4- 1) dy = 0, F = 

20. (sin y cos y + a cos 2 y) dx 4* a dy = 0 

21. Under what conditions for the constants A, B> C, D is 
(Ax 4- By) dx 4- (Ca 4 Dy) dy — 0 exact? Solve 
the exact equation. 
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22. CAS PROJECT. Graphing Particular Solutions 
Graph particular solutions of the following ODE. 
proceeding as explained. 

1 

(21) v cos x dx + — dx = 0 

y ‘ 

(a) Test for exactness. If necessary, find an integrating 
factor. Find the general solution u(x, y ) = c. 

(b) Solve (21) by separating variables. Is this simpler 
than (a)? 

(c) Graph contours u(x 9 v) — c by your CAS. (Cf. Fig. 
16.) 



(d) In another graph show the solution curves 
satisfying y( 0) = ±1. ±2, ±3, ±4. Compare the 
quality of (c) and (d) and comment. 

(e) Do the same steps for another nonexact ODE of 
your choice. 

23. WRITING PROJECT. Working Backward. Start 
from solutions u(x , y) = c of your choice, find a 
corresponding exact ODE, destroy exactness by a 
multiplication or division. This should give you a feet 
for the form of ODEs you can reach by the method of 
integrating factors. (Working backward is useful in 
other areas, too; Euler and other great masters 
frequently did it.) 

24. TEAM PROJECT. Solution by Several Methods. 
Show this as indicated. Compare the amount of work. 

(A) e y (sinh x dx 4* cosh x dy) = 0 as an exact ODE 
and by separation. 

(B) ( I + 2.v) cos y dx H- dy / cos y = 0 by Theorem 
2 and by separation. 

(C) (a 2 + y 2 ) dx — 2xy dy = 0 by Theorem 1 or 2 
and by separation with v = y/.v. 

(D) 3a 2 y dx + 4a 3 dy = 0 by Theorems 1 and 2 
and by separation. 

(E) Search the text and the problems for further ODEs 
that can be solved by more than one of the methods 
discussed so far. Make a list of these ODEs. Find 
further cases of your own. 


1.! Linear ODEs. Bernoulli Equation. 

Population Dynamics 

Linear ODEs or ODEs that can be transformed to linear form are models of various 
phenomena, for instance, in physics, biology, population dynamics, and ecology, as we 
shall see. A first-order ODE is said to be linear if it can be written 

(1) / + p(x)y = r(x). 

The defining feature of this equation is that it is linear in both the unknown function y 
and its derivative y f = dy/dx , whereas p and r may be any given functions of a. If in an 
application the independent variable is time, we write / instead of x. 

If the first term is /(a)/ (instead of/), divide the equation by /(a) to get the “standard 
form” (1), with / as the first term, which is practical. 

For instance, / cos a* + y sin a = x is a linear ODE, and its standard form is 
/ + y tan x = x sec a*. 

The function t(a) on the right may be a force, and the solution y(x) a displacement in 
a motion or an electrical current or some other physical quantity. In engineering, r(x) is 
frequently called the input, and y(x) is called the output or the response to the input (and, 
if given, to the initial condition). 
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Homogeneous Linear ODE. We want to solve (1) in some interval a < x < b, call it 
/, and we begin with the simpler special case that r(x) is zero for all x in /. (This is 
sometimes written r(jt) = 0.) Then the ODE (1) becomes 

(2) / + PW.Y = 0 


and is called homogeneous. By separating variables and integrating we then obtain 

= —p(x) dx , thus In [y| = —J p(x) dx + c*. 

Taking exponents on both sides, we obtain the general solution of the homogeneous 
ODE (2), 


(3) 


y(x) = ce~ spCx) ** 


(c = ±e c * when y ^ 0); 


here we may also choose c = 0 and obtain the trivial solution y(x) = 0 for all x in that 
interval. 

Nonhomogeneous Linear ODE. We now solve ( 1 ) in the case that r(x) in ( 1 ) is not 
everywhere zero in the interval J considered. Then the ODE (1 ) is called nonhomogeneous. 
It turns out that in this case, (1) has a pleasant property; namely, it has an integrating 
factor depending only on a\ We can find this factor F(x) by Theorem 1 in the last section. 
For this purpose we write (1) as 


(py — r) dx + dy = 0. 

This is P dx + Q dy = 0, where P = py — r and Q = 1. Hence die right side of (16) in 
Sec. 1.4 is simply 1(/? — 0) = p, so that (16) becomes 


1 dF 

7 lx =pW ' 


Separation and integration gives 

dF iif 

= p dx and In \F\ = J p dx. 

Taking exponents on both sides, we obtain the desired integrating factor F( x), 

F( x) = e fp dx . 

We now multiply (1) on both sides by this F. Then by the product rule, 
e Spdx (y' + py) = (e Spdx y) f = e fpdx r. 

By integrating the second and third of these three expressions with respect to x we get 

e Sv dx y = Je fpdx rdx + c. 

Dividing this equation by e !v dx and denoting the exponent fp dx by h, we obtain 


(4) 


y(x) = e h ^Je h rdx + cj , h = J p(x) dx. 
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EXAMPLE 1 


EXAMPLE 2 


(The constant of integration in h does not matter; see Prob. 2.) Formula (4) is the general 
solution of (1) in the form of an integral. Solving (1) is now reduced to the evaluation 
of an integral. In cases in which this cannot be done by the usual methods of calculus, 
one may have to use a numeric method for integrals (Sec. 19.5) or for the ODE itself 
(Sec. 21.1). 

The structure of (4) is interesting. The only quantity depending on a given initial 
condition is c. Accordingly, writing (4) as a sum of two terms, 

(4*) y(x) = e~ h Je h r dx + ceT h , 

we see the following: 

(5) Total Output = Response to the Input r 4* Response to the Initial Data. 


First-Order ODE, General Solution 

Solve the linear ODE 

r 2x 

y - y = e . 

Solution . Here, 

p = — 1. r = e 2 *, h = Jp dx = —x 
and from (4) we obtain the general solution 

y(x) = e x Q>*«**<& + c) = e x (e x + c) = ce* + e 2 *. 

From (4*) and (5) we see that the response to the input is e 2 *. 

In simpler cases, such as the present, we may not need the general formula (4). but may wish to proceed 
directly, multiplying the given equation by e h - e~ x . This gives 

</ - y)e~ x = (ye-*)' = = e* 

Integrating on both sides, we obtain the same result as before: 

ye~ x = e x + c, hence y = e 2 * + ce x . M 


First-Order ODE, initial Value Problem 

Solve the initial value problem 

y* + y tan.v = sin 2 \-, y(0) = 1. 

Solution . Here p = tan a \ r = sin 2x = 2 sin x cos x\ and 

jp dx — j tan x dx = In |sec x\. 

From this we see that in (4), 

e h = sec x , e~ h = cos .v, e h r = (sec .v)(2 sin x cos x) = 2 sin .v, 

and the general solution of our equation is 

y(x) = cos x ^2 Jsin .v dx + c j = c cos x - 2 cos 2 x. 

From this and the initial condition, 1 = c • I - 2 • l 2 ; thus c = 3 and the solution of our initial value problem 

* s y ~ ^ cos A ~ ^ cos Xm Here 3 cos x is the response to the initial data, and —2 cos 2 jc is the response to the 
input sin 2x. m 
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EXAMPLE 3 


Hormone Level 

Assume that the level of a certain hormone in the blood of a patient varies with time. Suppose that the time rate 
of change is the difference between a sinusoidal input of a 24-hour period from the thyroid gland and a continuous 
removal rate proportional to the level present. Set up a model for the hormone level in the blood and find its 
general solution. Find the particular solution satisfying a suitable initial condition. 

Solution. Step 1. Setting up a model Let y(/) be the hormone level at time /. Then the removal rate is Ky{t). 
The input rate is A + B cos (2 777/24), where A is the average input rate, and A ^ B to make the input nonnegative. 
(The constants A, B , and K can be determined by measurements.) Hence the model is 

y\t) = In — Out = A 4 B cos (-^777) - Ky{t) or y + Ky = A + B cos (^7r/). 

The initial condition for a particular solution y pa ,t .Vpart(O) = v 0 with t = 0 suitably chosen, e.g., 6:00 a.m. 

Step 2. General solution. In (4) we have p = K = const, h = Kt , and r = A + B cos (-^777). Hence (4) gives 
the general solution 

\{r) = e~ Kt je Kt ^A + B cos (It + ce~ Kt 

■ [j ♦ -nsFT? 5 - I )] + 

A B ( 777 7 Tt\ 

= — + o % I 144 K cos — 4- 1 277 sin — 1 + ce Kt . 

K 144A -2 + w* \ 12 12 / 


The last term decreases to 0 as t increases, practically after a short time and regardless of c (that is. of the initial 
condition). The other part of y(t) is called the steady-state solution because it consists of constant and periodic 
terms. The entire solution is called the transient-state solution because it models the transition from rest to the 
steady state. These terms are used quite generally for physical and other systems whose behavior depends on lime. 

Step 3. Particular solution. Setting / = 0 in y(t) and choosing y 0 = 0, we have 




thus 


B 




K 144 K 2 + i? 


1 44 AT. 


Inserting this result into y(f), we obtain the particular solution 


A B ( tj7 Trt\ (A 144 KB \ 

v„ a rt(0 ~ -P o o l 144 K cos + 1 2t7 sin ■ I — I — + 5 1 C )< 

- portv K 144A- 2 + IT 2 \ 12 12 ) \K 144* 2 + w 2 / 


-Kt 


with the steady-state part as before. To plot y pQrt we must specify values for the constants, say, A = B = 1 and 
K = 0.05, Figure 17 shows this solution. Notice that the transition period is relatively short (although K is small), 
and the curve soon looks sinusoidal: this is the response to the input A 4- B cos (^tt/) = 1 4- cos H 



Fig. 17. Particular solution in Example 3 
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EXAMPLE 


Reduction to Linear Form. Bernoulli Equation 

Numerous applications can be modeled by ODEs that are nonlinear but can be transformed 
to lineai* ODEs. One of the most useful ones of these is the Bernoulli equation 5 

(6) y' + p(x)y = g(x)y a (a any real number). 

If a = 0 or a = 1, Equation (6) is linear. Otherwise it is nonlinear. Then we set 

u(x) = Mr)] 1 " 41 . 

We differentiate this and substitute y f from (6), obtaining 

w' = (i - tf)y“V = 0 - <*)y~ a (gy a ~ py )• 

Simplification gives 

u' = (1 - a)(g - py 1 " a ), 

where y 1-a = u on the right, so that we get the linear ODE 

(7) u* + (1 - a)pit = (1 - a)g. 

For further ODEs reducible to linear from, see Ince’s classic [All] listed in App. 1. 
See also Team Project 44 in Problem Set 1.5. 

Logistic Equation 

Solve the following Bernoulli equation, known as the logistic equation (or Verhulst equation 6 ): 

(8) >>' = Ay - By 2 

Solution . Write (8) in the form (6). that is, 

y' - Ay = - By 2 

to see that a - 2, so that u = y 1_a = y" 1 . Differentiate this u and substitute y from (8), 
u - -v"V = -y“ 2 (/\y - By 2 ) = B - Ay~\ 

Tlie last term is —Av ~ 1 = -Au. Hence we have obtained the linear ODE 


5 JAKOB BERNOULLI (1654-1705), Swiss mathematician, professor at Basel, also known for his contribution 
to elasticity theory and mathematical probability. Tlie method for solving Bernoulli’s equation was discovered by 
the Leibniz in 1696. Jakob Bernoulli’s students included his nephew NIKLAUS BERNOULLI (1687-1759). who 
contributed to probability theory and infinite series, and his youngesi brother JOHANN BERNOULLI (1667-1748), 
who had profound influence on the development of calculus, became Jakob’s successor at Basel, and had among 
his students GABRIEL CRAMER (see Sec. 7.7) and LEONHARD EULER (see Sec. 2.5). His son DANIEL 
BERNOULLI (1700-1782) is known for his basic work in fluid flow and the kinetic theory of gases. 

6 PIERRE-FRANC0TS VERHULST, Belgian statistician, who introduced Eq. (8) as a model for human 
population growth in 1838. 
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u + An = B, 


The general solution is [by (4)] 


u = ce~ At + B/A. 


Since it = 1/y, this gives the general solution of (8), 


(9) 


1 _ » 

it ce~ M + BIA 


Directly from (8) vve see that y = 0 (y(r) = 0 for all /) is also a solution. 


(Fig. 18). 



Fig. 18. Logistic population model. Curves (9) in Example 4 with A/B — 4 


Population Dynamics 

The logistic equation (8) plays an important role in population dynamics, a field that 
models the evolution of populations of plants, animals, or humans over time t. If B = 0, 
then (8) is y* = dy/dt = Ay. In this case its solution (9) is y = (I /c)e At and gives exponential 
growth, as for a small population in a large country (the United States in early times!). 
This is called Malthus’s law . (See also Example 3 in Sec. 1.1.) 

The term — By 2 in (8) is a “braking term” that prevents the population from growing 
without bound. Indeed, if we write y f = Ay[l — (BlA)y] t we see that if _v < A/B, then 
y f > 0, so that an initially small population keeps growing as long as y < A/B. But if 
y > A/B , then y r < 0 and the population is decreasing as long as y > A/B . The limit is 
the same in both cases, namely, A/B . See Fig. 18. 

We see that in the logistic equation (8) the independent variable t does not occur 
explicitly. An ODE y 9 = /(f, y) in which t does not occur explicitly is of the form 

(10) y 9 = /O’) 


and is called an autonomous ODE. Thus the logistic equation (8) is autonomous. 

Equation (10) has constant solutions, called equilibrium solutions or equilibrium 
points. These are determined by the zeros of f(y), because /(y) = 0 gives / = 0 by (10); 
hence y = const. These zeros are known as critical points of (10). An equilibrium 
solution is called stable if solutions close to it for some t remain close to it for all further 
t. It is called unstable if solutions initially close to it do not remain close to it as t 
increases. For instance, y = 0 in Fig. 18 is an unstable equilibrium solution, and y = 4 
is a stable one. 
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EXAMPLE 5 Stable and Unstable Equilibrium Solutions. “Phase Line Plot” 

The ODE y - (y — 1 )(y — 2) has the stable equilibrium solution y x - 1 and the unstable y 2 = 2, as the 
direction field in Fig. 19 suggests. The values y x and y 2 are the zeros of the parabola /( r v) = (y - 1 )(y — 2) 
in the figure. Now. since the ODE is autonomous, we can “condense” the direction field to a “phase line plot” 
giving y x and y 2 , and the direction (upward or downward) of the arrows in the field, and thus giving information 
about the stability or instability of the equilibrium solutions. ■ 



(A) (B) (C) 

Fig. 19. Example 5. (A) Direction field. (B) “Phase line”. (C) Parabola /(y) 


A few further population models will be discussed in the problem set. For some more 
details of population dynamics, see C. W. Clark, Mathematical Bioeconomics , New York, 
Wiley, 1976. 

Further important applications of linear ODEs follow in the next section. 




1. (CAUTION!) Show that e ln x = I/a (not -a*) and 

g— ln(sec a:) = co$ ^ 

2. (Integration constant) Give a reason why in (4) you 
may choose die constant of integration in fp dx to be 
zero. 

|3— 17 1 GENERAL SOLUTION. INITIAL VALUE 

PROBLEMS 

Find the general solution. If an initial condidon is given, 
find also the corresponding particular solution and graph or 
sketch it. (Show the details of your work.) 

3. y' 4- 3.5y = 2.8 

4. y f = 4y 4- a 

5. y' + 1.25y = 5, y(0) = 6.6 


6. x 2 y' 4- 3Ay = 1 /a*, y(l) = —1 

7. y f + ky = e 2kx 

8. y' 4- 2y = 4 cos 2a*, y(j7r) = 2 

9. y' = 6(y - 2.5) tanh 1.5a 

10. y' 4- 4a 2 >* = (4a 2 - x)e~ x2/2 

11. y' 4- 2y sin 2a = 2e co * **, y( 0) = 0 

12. y f tan x = 2y - 8, yi^ir) = 0 

13. y' 4- 4y cot 2a = 6 cos 2a, y($7r) = 2 

14. y' 4* y tan a = e ~ 0 01x cos a, y(0) = 0 

15. y' 4- yf a 2 = 2xe 1/x , y(l) = 13.86 

16. y' cos 2 a + 3y = 1, y^Tr) = § 

17. A 3 y' 4* 3A 2 y = 5 sinh 10 a 
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18-24 


NONLINEAR ODEs 


Using a method of this section or separating variables, find 
the general solution. If an initial condition is given, find 
also the particular solution and sketch or graph it. 

18. y' + y = y 2 , y(0) = — 1 

19. y' = 5.7 y - 6.5y 2 

20. (a * 2 -r l)y' = -tan y, y( 0) = \tt 

21. y + (x + l)y = e x *y 3 , y( 0) = 0.5 

22. y* sin 2 y + x cos 2y = 2x 

23. 2 >V + y 2 sin x = sin a, y( 0) = V2 

24. y* + x 2 y = ( e ~ x 3 sinh A')/(3y 2 ) 


25-36 


FURTHER APPLICATIONS 


25. (Investment programs) Bill opens a retirement 
savings account with an initial amount y 0 and then adds 
$& to the account at the beginning of every year until 
retirement at age 65. Assume that the interest is 
compounded continuously at the same rate R over the 
years. Set up a model for the balance in the account 
and fmd the general solution as well as the particular 
solution, letting t = 0 be the instant when the account 
is opened. How much money will Bill have in the 
account at age 65 if he starts at 25 and invests $1000 
initially as well as annually, and the interest rate R is 
6%? How much should he invest initially and annually 
(same amounts) to obtain the same final balance as 
before if he starts at age 45? First, guess. 

26. (Mixing problem) A tank (as in Fig. 9 in Sec. 1.3) 
contains 1000 gal of water in which 200 lb of salt is 
dissolved. 50 gal of brine, each gallon containing 
(1 + cos t) lb of dissolved salt, runs into the tank per 
minute. The mixture, kept uniform by stirring, runs out 
at the same rate. Find the amount of salt in the tank at 
any time t (Fig. 20). 



Fig. 20. Amount of salt y(f) in the tank in Problem 26 


27. (Lake Erie) Lake Erie has a water volume of about 
450 km 3 and a flow rate (in and out) of about 1 75 km 3 
per year. If at some instant the lake has pollution 
concentration p = 0.04%, how long, approximately, 
will it take to decrease it to p/2, assuming that the 
inflow is much cleaner, say, it has pollution 


concentration pi 4, and the mixture is uniform (an 
assumption that is only very imperfectly true)? First, 
guess. 

28. (Heating and cooling of a building) Heating and 
cooling of a building can be modeled by the ODE 

T f = k x (T - T a ) + k 2 (T - TJ + />, 

where T = T(t) is the temperature in the building at 
lime /, T a the outside temperature, T w the temperature 
wanted in the building, and P the rate of increase of T 
due to machines and people in the building, and k x and 
k 2 are (negative) constants. Solve this ODE, assuming 
P = const , T w = const, and T a varying sinusoidally 
over 24 hours, say, T a = A - C cos (2fl724)f. Discuss 
the effect of each term of the equation on the solution. 

29. (Drug iqjection) Find and solve the model for drug 
injection into the bloodstream if, beginning at t = 0, a 
constant amount A g/min is injected and the drug is 
simultaneously removed at a rate proportional to the 
amount of the drug present at time /. 

30. (Epidemics) A model for the spread of contagious 
diseases is obtained by assuming that the rate of spread 
is proportional to the number of contacts between 
infected and noninfected persons, who are assumed to 
move freely among each other. Set up the model. Find 
the equilibrium solutions and indicate their stability or 
instability. Solve the ODE. Find the limit of the 
proportion of infected persons as t — > sc and explain 
what it means. 

31. (Extinction vs. unlimited growth) If in a population 
y(t) the death rate is proportional to the population, and 
the birth rate is proportional to the chance encounters 
of meeting mates for reproduction, what will the model 
be? Without solving, find out what will eventually 
happen to a small initial population. To a large one. 
Then solve the model. 

32. (Harvesting renewable resources. Fishing) Suppose 
that the population y(t) of a certain kind of fish is given 
by the logistic equation (8), and fish are caught at a 
rate Hy proportional to y. Solve this so-called Schaefer 
model. Find the equilibrium solutions y x and y 2 (> 0) 
when H < A. The expression Y = Hy 2 is called the 
equilibrium harvest or sustainable yield corresponding 
to H. Why? 

33. (Harvesting) In Prob. 32 find and graph the solution 
satisfying y(0) = 2 when (for simplicity) A = B = l 
and H = 0.2. What is the limit? What does it mean? 
What if there were no fishing? 

34. (Intermittent harvesting) In Prob. 32 assume that you 
fish for 3 years, then fishing is banned for the next 3 
years. Thereafter you start again. And so on. This is 
called intermittent harvesting . Describe qualitatively 
how the population will develop if intermitting is 
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continued periodically. Find and graph the solution for 
the first 9 years, assuming that A — B — 1, H — 0.2, 
and y( 0) = 2. 



Fig. 21. Fish population in Problem 34 


35. (Harvesting) If a population of mice (in multiples of 
1000) follows the logistic law with A = 1 and B = 0.25, 
and if owls catch at a time rate of 1 0% of the population 
present, what is the model, its equilibrium harvest for 
that catch, and its solution? 

36. (Harvesting) Do you save work in Prob. 34 if you first 
transform the ODE to a Linear ODE? Do this 
transformation. Solve the resulting ODE. Does the 
resulting y(f) agree with that in Prob. 34? 
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GENERAL PROPERTIES OF LINEAR ODEs 


These properties are of practical and theoretical importance 
because they enable us to obtain new solutions from given 
ones. Thus in modeling, whenever possible, we prefer linear 
ODEs over nonlinear ones, which have no similar 
properties. 

Show that nonhomogeneous linear ODEs (1) and 
homogeneous linear ODEs (2) have the following 
properties. Illustrate each property by a calculation for two 
or three equations of your choice. Give proofs. 


37. The sum y x 4- y 2 of two solutions y x and y 2 of the 
homogeneous equation (2) is a solution of (2), and so 
is a scalar multiple ay x for any constant a. These 
properties are not true for (1)! 


38. y = 0 (that is, y(x) = 0 for all jc, also written y(x) = 0) 
is a solution of (2) [not of (1) if r(x) =£ 0!], called the 

trivial solution. 


39. The sum of a solution of (1) and a solution of (2) is a 
solution of (1). 

40. The difference of two solutions of ( I ) is a solution of (2). 

41. If y x is a solution of (1), what can you say about ctj? 

42. If y x and y 2 are solutions of y[ 4 py l — r x and 
y% + py* - r 2l respectively (with the same /?!), what 
can you say about the sum y x 4 y 2 ? 


43. CAS EXPERIMENT, (a) Solve the ODE 

y — ylx = — jc - 1 cos(l/x). Find an initial condition 
for which the arbitrary constant is zero. Graph the 
resulting particular solution, experimenting to obtain 
a good figure near x = 0. 

(b) Generalizing (a) from n = 1 to arbitrary /z, solve 
the ODE y 1 — tiyfx = — x n ~ 2 cos (l/.v). Find an initial 
condition as in (a), and experiment with the graph. 

44. TEAM PROJECT. Riccati Equation, Clairaut 
Equation. A Riccati equation is of the form 

(11) y' 4 p(x)y = g(x)y 2 4 h(x). 

A Clairaut equation is of the form 

(12) y = xy' 4 g(y'). 

(a) Apply die transformation y = Y 4 I/m to the 
Riccati equation (1 1 ), where Y is a solution of (1 1), and 
obtain for it the linear ODE u 4 (2 Yg — p)u = —g. 
Explain the effect of the transformation by writing it 
as y = Y 4 t>, v = I/m. 

(b) Show that y = Y = x is a solution of 

>•' - (2a- 3 +!)>>= —A 2 )’ 2 - A 4 - A + 1 
and solve this Riccati equation, showing the details. 

(c) Solve y' 4 (3 - 2a 2 sin x)y 

= -y 2 sin:r 4 2x 4 3jc 2 - x 4 sinx, using (and 
verifying) that y = x 2 is a solution. 

(d) By working “backward” from the M-equation find 
further Riccati equations that have relatively simple 
solutions. 

(e) Solve the Clairaut equation y = xy f 4 1 /y'.Hint. 
Differentiate this ODE with respect to x. 

(f) Solve the Clairaut equation y' 2 - xy' 4 y = 0 
in Prob. 16 of Problem Set 1.1. 

(g) Show that the Clairaut equation (12) has as 
solutions a family of straight lines y = cx 4 g(c ) and 
a singular solution determined by g'(s ) = —a, where 
s = y f , that forms the envelope of that family. 

45. (Variation of parameter) Another method of 
obtaining (4) results from the following idea. Write 
(3) as cy*, where y* is the exponential function, 
which is a solution of the homogeneous linear ODE 
y* 7 4 py* — 0. Replace the arbitrary constant c in (3) 
with a function u to be determined so that the resulting 
function y = uy* is a solution of the nonhomogeneous 
linear ODE y 4 py = r. 

46. TEAM PROJECT. Transformations of ODEs. We 
have transformed ODEs to separable form, to exact 
form, and to linear form. The purpose of such 
transformations is an extension of solution methods to 
larger classes of ODEs. Describe the key idea of each 
of these transformations and give three typical 
examples of your choice for each transformation, 
showing each step (not just the transformed ODE). 
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1.6 Orthogonal Trajectories. Optional 

An important type of problem in physics or geometry is to find a family of curves that 
intersect a given family of curves at right angles. The new curves are called orthogonal 
trajectories of the given curves (and conversely). Examples are curves of equal 
temperature (i isotherms ) and curves of heat flow, curves of equal altitude (contour lines) 
on a map and curves of steepest descent on that map, curves of equal potential 
(equipotential curves , curves of equal voltage — the concentric circles in Fig. 22), and 
curves of electric force (the straight radial segments in Fig. 22). 



Fig. 22. Equipotential lines and curves of electric force (dashed) 
between two concentric (black) circles (cylinders in space) 


Here the angle of intersection between two curves is defined to be the angle between 
the tangents of the curves at the intersection point. Orthogonal is another word for 
perpendicular. 

In many cases orthogonal trajectories can be found by using ODEs, as follows. Let 
(1) G(*,y,c) = 0 

be a given family of curves in the jty-plane, where each curve is specified by some value 
of c. This is called a one-parameter family of curves, and c is called the parameter 
of the family. For instance, a one-parameter family of quadratic parabolas is given by 
(Fig. 23) 


y = cx 2 or, written as in (1), G(a, y 9 c) = y — cx 2 = 0. 

Step 1. Find an ODE for which the given family is a general solution. Of course, this 
ODE must no longer contain the parameter c. In our example we solve algebraically for 
c and then differentiate and simplify; thus, 

_>L = C ~ ft? _ o 

a - 2 ’ x 4 0> 

hence 


, 2 y 


y 


X 
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The last of these equations is the ODE of the given family of curves. It is of the form 

( 2 ) / =f(x,y)- 


Step 2. Write down the ODE of the orthogonal trajectories, that is, the ODE whose general 
solution gives the orthogonal trajectories of the given curves. This ODE is 


( 3 ) 


/fey) 


with the same / as in (2). Why? Well, a given curve passing through a point (x 0 , y 0 ) has 
slope f(x 0i y 0 ) at that point, by (2). The trajectory through (x 0 , y 0 ) has slope — l//(x 0 , yo) 
by (3). The product of these slopes is - 1 , as we see. From calculus it is known that this 
is the condition for orthogonality (perpendicularity) of two straight lines (the tangents at 
(a 0 , y 0 ))> hence of the curve and its orthogonal trajectory at ( x 0 , y 0 ). 

Step 3. Solve (3). 

For our parabolas y = cx 2 we have y = 2 y/x. Hence their orthogonal trajectories are 
obtained from y f = —a/2 y or 2yy f 4- x = 0. By integration, y 2 4- §x 2 = c*. These are 
the ellipses in Fig. 23 with semi-axes V2c* and Vc*. Here, c* > 0 because c* = 0 gives 
just the origin, and c * < 0 gives no real solution at all. 



Fig. 23. Parabolas and orthogonal trajectories (ellipses) in the text 



1-12 


ORTHOGONAL TRAJECTORIES 


Sketch or graph some of the given curves. Guess what their 
orthogonal trajectories may look like. Find these 
trajectories. 


(Show the details of your work.) 

1. y = Ax 4- c 2. y = c/x 

3. y = ca 4. y 2 = 2x 2 4 c 

5, x 2 y = c 6. y = ce~ 3x 


7. y = ce x *' z 8. * 2 - y* = c 

9. 4x 2 + y 2 = c 10. x = cVy 

11. x = ce y,i 12. x 2 + (y — c ) 2 = c 2 

1 13-15 1 OTHER FORMS OF THE ODEs (2) AND (3) 

13. 0 as independent variable) Show that (3) may be 
written dxtdy = -f(x, y). Use this form to find the 
orthogonal trajectories of y = 2x + ce~ x . 
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14, (Family g(x, y) = c) Show that if a family is given as 
g(x, y) = c, then the orthogonal trajectories can be 
obtained from the following ODE, and use the latter to 
solve Prob. 6 written in the form g(x, y) = c. 

dy _ dgldy 
dx dgfdx 

15. (Cauchy-Riemann equations) Show that for a family 
«(x, y) = c = const the orthogonal trajectories 
u(x, y) = c* = const can be obtained from the following 
Cauchy-Riemann equations (which are basic in 
complex analysis in Chap. 13) and use them to find the 
orthogonal trajectories of e x sin y = const. (Here, 
subscripts denote partial derivatives.) 


U X Vyy Uy V x 


16-20 


APPLICATIONS 


16. (Fluid flow) Suppose that the streamlines of the flow 
(paths of the particles of the fluid) in Fig. 24 are 
>P(a\ y) = xy = const. Find their orthogonal trajectories 
(called equipotential lines, for reasons given in Sec. 
18.4). 



Fig. 24. Flow in a channel in Problem 16 

17. (Electric field) Let the electric equipotential lines 
(curves of constant potential) between two concentric 
cylinders (Fig. 22) be given by u(x, y) — x 2 4- y 2 = c. 
Use the method in the text to find their orthogonal 
trajectories (the curves of electric force). 


18. (Electric field) The lines of electric force of two 
opposite charges of the same strength at (—1, 0) and 
(1, 0) are the circles through (-1,0) and (1, 0). Show 
that these circles are given by „v 2 + (y — c) 2 = 1 4- c 2 . 
Show that the equipotential lines (orthogonal 
trajectories of those circles) are the circles given by 
(x 4- c*) 2 4- y 2 = c* 2 - 1 (dashed in Fig. 25). 



Fig. 25. Electric field in Problem 18 


19. (Temperature field) Let the isotherms (curves of 
constant temperature) in a body in the upper half-plane 
y > 0 be given by 4 jc 2 + 9y 2 = c. Find the orthogonal 
trajectories (the curves along which heat will flow in 
regions filled with heat-conducting material and free 
of heat sources or heat sinks). 

20. TEAM PROJECT. Conic Sections. (A) Stale the 
main steps of the present method of obtaining orthogonal 
trajectories. 

(B) Find conditions under which the orthogonal 
trajectories of families of ellipses x 2 /# 2 4- y 2 /Z? 2 = c are 
again conic sections. Illustrate your result graphically 
by sketches or by using your CAS. What happens if 
fl^0?Iffc->0? 

(C) Investigate families of hyperbolas 
x 2 /a 2 — y 2 fb 2 = c in a similar fashion. 

(D) Can you find more complicated curves for which 
you get ODEs that you can solve? Give it a try. 


l.i Existence and Uniqueness of Solutions 

The initial value problem 

l/l + \yl = 0, ,v(0) = 1 

has no solution because y = 0 (that is, y(x) = 0 for all x) is the only solution of the ODE. 
The initial value problem 


/ = 2x , 


y(0) = 1 
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has precisely one solution, namely, y = x 2 4- 1. The initial value problem 

V = v(0) = 1 

has infinitely many solutions, namely, y = 1 4* cx, where c is an arbitrary constant because 
y(0) = 1 for all c. 

From these examples we see that an initial value problem 
(1) y' = f(x, y), y(x 0 ) = y 0 

may have no solution, precisely one solution, or more than one solution. This fact leads 
to the following two fundamental questions. 


Problem of Existence 

Under what conditions does an initial value problem of the form (l) have at least 
one solution {hence one or several solutions)? 

Problem of Uniqueness 

Under what conditions does that problem have at most one solution {hence excluding 
the case that is has more than one solution)? 


Theorems that state such conditions are called existence theorems and uniqueness 
theorems, respectively. 

Of course, for our simple examples we need no theorems because we can solve these 
examples by inspection; however, for complicated ODEs such theorems may be of 
considerable practical importance. Even when you are sure that your physical or other 
system behaves uniquely, occasionally your model may be oversimplified and may not 
give a faithful picture of the reality. 


THEOREM 1 


Existence Theorem 

Let the right side f(x, y) of the ODE in the initial value problem 

(1) / = f{x, y\ y{x 0 ) = y 0 
be continuous at all points (.v, y) in some rectangle 

R: \x - a‘o| < a, \ y - y 0 | < b (Fig. 26) 

and bounded in R; that is, there is a number K such that 

(2) |/(a\ y)l s K for all (x, y) in R. 

Then the initial value problem (1) has at least one solution y(x). This solution exists 
at least for all x in the subinterval |.v - x 0 | < a of the interval |x - x 0 | < a; here, 
a is the smaller of the two numbers a and b/K. 
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THEOREM 


>’o " 6 



R 


-? 

1 

1 

1 

1 

1 

k ! 

1 

1 1 1 
1 1 1 
1 1 1 


Fig. 26. Rectangle R in the existence and uniqueness theorems 


(Example of Boundedness. The function /( x, y) = x 2 + y 2 is bounded (with K = 2) in the 
square \x\ < 1 , [y| < 1 . The function /( x, y) = tan (x -I- v) is not bounded for \x + y\ < tt/1. 
Explain!) 


Uniqueness Theorem 

Let f and its partial derivative f y = df/dy be continuous for all {x, v) in the 
rectangle R (Fig. 26) and bounded, say, 

(3) (a) |/(x, y)| ^ K, (b) |/„(x, y)| M for all (x, y) in R. 

Then the initial value problem (1) has at most one solution y(x). Thus, by Theorem 1, 
the problem has precisely one solution. This solution exists at least for all x in that 
subinterval \x — jc 0 | < a. 


Understanding These Theorems 

These two theorems take care of almost all practical cases. Theorem 1 says that if f(x , y) 
is continuous in some region in the Ay-plane containing the point (x 0 , y 0 ), then the initial 
value problem (1) has at Least one solution. 

Theorem 2 says that if, moreover, the partial derivative df/dy of / with respect to y 
exists and is continuous in that region, then (1) can have at most one solution; hence, by 
Theorem 1, it has precisely one solution. 

Read again what you have just read — these are entirely new ideas in our discussion. 

Proofs of these theorems are beyond the level of this book (see Ref. [A1 1] in App. 1); 
however, the following remarks and examples may help you to a good understanding of 
the theorems. 

Since y' = /(a, y)> the condition (2) implies that \y'\ ^ K ; that is, the slope of any 
solution curve y(x) in R is at least -K and at most K. Hence a solution curve that passes 
through the point (x 0 , y 0 ) must lie in the colored region in Fig. 27 on the next page bounded 
by the lines / x and / 2 whose slopes are — K and K , respectively. Depending on the form 
of R, two different cases may arise. In the first case, shown in Fig. 27a, we have b/K ^ 
a and therefore a = a in the existence theorem, which then asserts that the solution exists 
for all .v between x 0 — a and x 0 + a. In the second case, shown in Fig. 27b, we have 
b/K < a. Therefore, a = b/K < a, and all we can conclude from the theorems is that the 
solution exists for all jc between x 0 - b/K and x 0 + b/K. For larger or smailer x’s the 
solution curve may leave the rectangle /?, and since nothing is assumed about f outside 
R > nothing can be concluded about the solution for those larger or smaller x’s; that is, for 
such x’s the solution may or may not exist — we don’t know. 
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EXAMPLE 1 




Fig. 27. The condition (2) of the existence theorem, (a) First case, (b) Second case 


Let us illustrate our discussion with a simple example. We shall see that our choice of 
a rectangle R with a large base (a long ^-interval) will lead to the case in Fig. 27b. 

Choice of a Rectangle 

Consider the initial value problem 


/ . . 2 
V = 1 + y 


.v(0) = 0 


and take the rectangle R : [x\ < 5, |v| < 3. Then a = 5, b = 3, and 

\f(x, y)| = |l +y 2 \ati= 10. 


«/ 

ay 


= 2|y| S M = 6, 


a = 


K 


0.3 < a. 


Indeed, the solution of the problem is y = tan* (see Sec. 1.3, Example 1). This solution is discontinuous at 
±7 t/ 2, and there is no continuous solution valid in the entire interval |.v| < 5 from which we started. H 

The conditions in the two theorems are sufficient conditions rather than necessary ones, and 
can be lessened. In particular, by the mean value theorem of differential calculus we have 


fix, y 2 ) ~ fix, }’i) = (v 2 - >>i) Y~ 

dy 


y—y 


where (a% y\) and (a, y 2 ) are assumed to be in R, and y is a suitable value between y x and 
y 2 - From this and (3b) it follows that 

( 4 ) \fix, y 2 ) ~ fix, ^i)| ^ M\y z - yj. 

It can be shown that (3b) may be replaced by the weaker condition (4), which is known 
as a Lipschitz condition . 7 However, continuity of /(a, y) is not enough to guarantee the 
uniqueness of the solution. This may be illustrated by the following example. 
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EXAMPLE 2 Nonuniqueness 

The initial value problem 


has the two solutions 


y = 0 


y' = Vbi. y(0) = o 


and 


.V* 


x a /4 if x s 0 
-x 2 /4 if x < 0 


although f{x, y ) = V|y| is continuous for all y. The Lipschitz condition (4) is violated in any region that includes 
the line y = 0, because for y x = 0 and positive y 2 we have 


(5) 


|/(-v. yz) - /(-v,.vr) | = = _l_ 

|y 2 - yil y 2 Vyl ’ 


(V^>0) 


and this can be made as large as we please by choosing y 2 sufficiently small, whereas (4) requires that the 
quotient on the left side of (5) should not exceed a fixed constant M. H 



1. (Vertical strip) If the assumptions of Theorems 1 and 2 
are satisfied not merely in a rectangle but in a vertical 
infinite strip |.v - a 0 | < a, in what interval will the 
solution of (1) exist? 

2. (Existence?) Does the initial value problem 
( x — 1)/ = 2>\ y(l) = 1 have a solution? Does your 
result contradict our present theorems? 

3. (Common points) Can two solution curves of the same 
ODE have a common point in a rectangle in which the 
assumptions of the present theorems are satisfied? 

4. (Change of initial condition) What happens in Prob. 2 
if you replace y(l) = 1 with y(l) = kl 

5. (Linear ODE) If p and r in y' + p(x)y = r(x) are 
continuous for all a* in an interval |a - a 0 | ^ a, show 
that f{x, y) in this ODE satisfies the conditions of our 
present theorems, so that a corresponding initial value 
problem has a unique solution. Do you actually need 
these theorems for this ODE? 

6. (Three possible cases) Find all initial conditions such 
that (a 2 — 4a)/ = (2a — 4)y has no solution, precisely 
one solution, and more than one solution. 

7. (Length of x-interval) In most cases the solution of an 
initial value problem (1) exists in an A-interval larger 
than that guaranteed by the present theorems. Show this 
fact for / = 2y 2 , y(l) = 1 by finding the best possible 
a (choosing b optimally) and comparing the result with 
the actual solution. 


8. PROJECT. Lipschitz Condition. (A) State the 
definition of a Lipschitz condition. Explain its relation 
to the existence of a partial derivative. Explain its 
significance in our present context. Illustrate your 
statements by examples of your own. 

(B) Show that for a linear ODEv' 4* p(x)y = r( a) with 
continuous p and r in [a - a 0 | =kci a Lipschitz condition 
holds. This is remarkable because it means that for a 
linear ODE the continuity of /(a, y) guarantees not only 
the existence but also the uniqueness of the solution of 
an initial value problem. (Of course, this also follows 
directly from (4) in Sec. 1.5.) 

(C) Discuss the uniqueness of solution for a few simple 
ODEs that you can solve by one of the methods 
considered, and find whether a Lipschitz condition is 
satisfied. 

9. (Maximum a) What is the largest possible a in 
Example 1 in the text? 

10. CAS PROJECT. Picard Iteration. (A) Show that by 
integrating the ODE in (1) and observing the initial 
condition you obtain 

(6) y(x) = v 0 + f f{t, y(t)) dt. 

*0 


7 RUDOLF LIPSCHITZ (1832-1903), German mathematician. Lipschitz and similar conditions are important 
in modem theories, for instance, in partial differential equations. 
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This form (6) of (1) suggests Picard’s iteration 
method 8 , which is defined by 

(7) y n (.v) = y 0 + f /(/, .Vn-iM) dt, n = 1, 2, • • . 

It gives approximations y lt y 2 , y& • • • of the unknown 
solution y of ( 1 ). Indeed, you obtain y\ by substituting 
v = y 0 on the right and integrating — this is the first 
step — , then y 2 by substituting y = y x on the right and 
integrating — this is the second step — , and so on. Write 
a program of the iteration that gives a printout of the 
first approximations y 0 , 3V * • • » }'n as well as then- 
graphs on common axes. Try your program on two 
initial value problems of your own choice. 


(B) Apply the iteration to y = x + y, y(0) = 0. Also 
solve the problem exactly. 

(C) Apply the iteration to y f = 2y 2 , y(0) = 1. Also 
solve the problem exactly. 

(D) Find all solutions of y = 2Vy, v( 1) = 0. Which 
of them does Picard’s iteration approximate? 

(E) Experiment with the conjecture that Picard’s 
iteration converges to the solution of the problem for 
any initial choice of y in the integrand in (7) (leaving 
y 0 outside the integral as it is). Begin with a simple 
ODE and see what happens. When you are reasonably 
sure, take a slightly more complicated ODE and give 
it a try. 


■i 
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E.WZQ1H S T IONS AND PROBLEMS 


1. Explain the terms ordinal y differential equation (ODE), 
partial differential equation (PDE), order , general 
solution , and particular solution . Give examples. Why 
are these concepts of importance? 

2. What is an initial condition? How is this condition used 
in an initial value problem? 

3. What is ahomogeneous linear ODE? A nonhomogeneous 
linear ODE? Why are these equations simpler than 
nonlinear ODEs? 

4. What do you know about direction fields and their 
practical importance? 

5. Give examples of mechanical problems that lead to ODEs. 

6. Why do electric circuits lead to ODEs? 

7. Make a list of the solution methods considered. Explain 
each method with a few short sentences and illustrate 
it by a typical example. 

8. Can certain ODEs be solved by more than one method? 
Give three examples. 

9. What are integrating factors? Explain the idea. Give 
examples. 

10. Does every first-order ODE have a solution? A general 
solution? What do you know about uniqueness of 
solutions? 


11-14 


DIRECTION FIELDS 


Graph a direction field (by a CAS or by hand) and sketch 
some of the solution curves. Solve the ODE exactly and 
compare. 


11. / = 1 + 4v 2 12. / =3 y - 2.x 


13. y 1 = 4 y — .v 2 14. y' = 16a/}’ 


1 15-26 1 GENERAL SOLUTION 

Find the general solution, indicate which method in this 
chapter you are using. Show the details of your work. 

15. y' = x 2 (l + y 2 ) 

16. y = x(y - a - 2 + 1) 

17. yy' + xy 2 = a 

18. 7r sin 7 ta* cosh 3y dx + 3 cos ttx sinh 3y dy = 0 

19. y f + y sin x = sin x 20. y f — y = 1/y 

21. 3 sin 2y dx + 2x cos 2y dy = 0 

22. xy' = a tan (y/x) + y 

23. (y cos xy — 2a*) dx + (x cos A*y 4* 2y) dy = 0 

24. xy r = (y “ 2a) 2 *F y (Set y — 2a = z.) 

25. sin (y - a) dx + [cos (y - a) - sin (y - a)] dy = 0 

26. Ay' = (y/A) 3 + y 
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INITIAL VALUE PROBLEMS 


Solve the following initial value problems. Indicate the 
method used. Show the details of your work. 

27. yy' + a = 0, y(3) = 4 

28. y' — 3y = — I2y 2 , y(0) = 2 

29. / = 1 + y 2 , y(\7r) - 0 

30. y' 4- iry — 2b cos 7ta, y(0) = 0 

31. (2a y 2 - sin x) dx + (2 + 2A 2 y) dy = 0, y(0) = l 

32. [2y + y 2 / x + ^(1 + 1/a)] dx + (a + 2y) dy = 0, 

yd) = 1 


S EMILE PICARD ( 1 856-1 94 1 ), French mathematician, also known for his important contributions to complex 
analysis (see Sec. 16.2 for his famous theorem). Picard used his method to prove Theorems I and 2 as well as 
the convergence of the sequence (7) to the solution of ( 1 ). In precomputer times die iteration was of little practical 
value because of the integrations. 
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APPLICATIONS, MODELING 

33. (Heat flow) If the isotherms in a region are .v 2 — y 2 = c, 
what are the curves of heat flow (assuming orthogonality)? 

34. (Law of cooling) A thermometer showing 10°C is 
brought into a room whose temperature is 25°C. After 
5 minutes it shows 20°C. When will the thermometer 
practically reach the room temperature, say, 24.9°C? 

35. (Half-life) If 1 0% of a radioactive substance disintegrates 
in 4 days, what is its half-life? 

36. (Half-life) What is the half-life of a substance if after 
5 days, 0.020 g is present and after 10 days, 0.015 g? 

37. (Half-life) When will 99% of the substance in Prob. 35 
have disintegrated? 

38. (Air circulation) In a room containing 20 000 ft 3 of 
air, 600 ft 3 of fresh air flows in per minute, and the 
mixture (made practically uniform by circulating fans) 
is exhausted at a rate of 600 cubic feet per minute 
(cfm). What is the amount of fresh air y(/) at any time 
if y(0) = 0? After what time will 90% of the air be 
fresh? 

39. (Electric field) If the equipotential lines in a region of 
the Ay-plane are 4a 2 + y 2 = c, what are the curves of 
the electrical force? Sketch both families of curves. 


40. (Chemistry) In a bimolecular reaction A + B M, 
a moles per liter of a substance A and b moles per liter 
of a substance B are combined. Under constant 
temperature the rate of reaction is 

y = k(a - y)(b - y) (Law of mass action); 

that is, y is proportional to the product of the 
concentrations of the substances that are reacting, where 
v(0 is the number of moles per liter which have reacted 
after time t. Solve this ODE, assuming that a ± b. 

41. (Population) Find the population y(f) if the birth rate is 
proportional to y(t) and the death rate is proportional to 
the square of y(/). 

42. (Curves) Find all curves in the first quadrant of the Ay- 
plane such that for every tangent, the segment between 
the coordinate axes is bisected by the point of tangency. 
(Make a sketch.) 

43. (Optics) Lambert’s law of absorption 9 states that the 
absorption of light in a thin transparent layer is 
proportional to the thickness of the layer and to the 
amount of light incident on that layer. Formulate this 
law as an ODE and solve it. 




First-Order ODEs 


This chapter concerns ordinary differential equations (ODEs) of first order and 
their applications. These are equations of the form 

(1) F(x , y, y) = 0 or in explicit form y f = f(x, y) 

involving the derivative y = dy/dx of an unknown function y, given functions of 
a\ and, perhaps, y itself. If the independent variable x is time, we denote it by t. 

In Sec. 1 . 1 we explained the basic concepts and the process of modeling, that is, 
of expressing a physical or other problem in some mathematical form and solving 
it. Then we discussed the method of direction fields (Sec. 1.2), solution methods 
and models (Secs. 1.3-1 .6), and, finally, ideas on existence and uniqueness of 
solutions (Sec. 1.7). 


®JOHANN HEINRICH LAMBERT (1728-1777), German physicist and mathematician. 
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A first-order ODE usually has a general solution, that is, a solution involving an 
arbitrary constant, which we denote by c. In applications we usually have to find a 
unique solution by determining a value of c from an initial condition y(x 0 ) = y 0 - 
Together with the ODE this is called an initial value problem 

(2) y = f(x, y), y(A' 0 ,) = y 0 (x 0 , y 0 given numbers) 

and its solution is a particular solution of the ODE. Geometrically, a general 
solution represents a family of curves, which can be graphed by using direction 
fields (Sec. 1.2). And each particular solution corresponds to one of these curves. 
A separable ODE is one that we can put into the form 

(3) g(y) dy = f(x) dx (Sec. 1.3) 

by algebraic manipulations (possibly combined with transformations, such as ylx = u) 
and solve by integrating on both sides. 

An exact ODE is of the form 

(4) M{x, y) dx + N(x, y) dy = 0 (Sec. 1.4) 

where M dx 4* N dy is the differential 

du = u x dx 4- u y dy 

of a function u(x , y). so that from du = 0 we immediately get the implicit general 
solution u(x , y) = c. This method extends to nonexact ODEs that can be made exact 
by multiplying them by some function F(x , y), called an integrating factor (Sec. 1 .4). 
Linear ODEs 

(5) / + p(x)y = r(x) 

are very important. Their solutions are given by the integral formula (4), Sec. 1.5. 
Certain nonlinear ODEs can be transformed to linear form in terms of new variables. 
This holds for the Bernoulli equation 

/ + p(x)y = g(x)y a (Sec. 1.5). 

Applications and modeling are discussed throughout the chapter, in particular in 
Secs. 1.1, 1.3, 1.5 (population dynamics, etc.), and 1.6 (trajectories). 

Picard’s existence and uniqueness theorems are explained in Sec. 1.7 (and 
Picard's iteration in Problem Set 1.7). 

Numeric methods for first-order ODEs can be studied in Secs. 21.1 and 21.2 
immediately after this chapter, as indicated in the chapter opening. 





CHAPTER 2 

Second-Order Linear ODEs 


Ordinary differential equations (ODEs) may be divided into two large classes, linear 
ODEs and nonlinear ODEs. Whereas nonlinear ODEs of second (and higher) order 
generally are difficult to solve, linear ODEs are much simpler because various properties 
of their solutions can be characterized in a general way, and there are standard methods 
for solving many of these equations. 

Linear ODEs of the second order are the most important ones because of their 
applications in mechanical and electrical engineering (Secs. 2.4, 2.8, 2.9). And their theory 
is typical of that of all linear ODEs, but the formulas are simpler than for higher order 
equations. Also the transition to higher order (in Chap. 3) will be almost immediate. 

This chapter includes the derivation of general and particular solutions, the latter in 
connection with initial value problems. 

(Boundary value problems follow in Chap. 5, which also contains solution methods for 
Legendre’s, Bessel’s, and the hypergeometric equations.) 

COMMENT. Numerics for second-order ODEs can be studied immediately after this 
chapter \ See Sec. 21.3, which is independent of other sections in Chaps. 19-21. 

Prerequisite: Chap. 1, in particular. Sec. 1.5. 

Sections that may be omitted in a shorter course: 2.3, 2.9, 2.10. 

References and Answers to Problems: App. 1 Part A, and App. 2. 


2.1 Homogeneous Linear ODEs of Second Order 

We have already considered first-order linear ODEs (Sec. 1.5) and shall now define and 
discuss linear ODEs of second order. These equations have important engineering 
applications, especially in connection with mechanical and electrical vibrations (Secs. 2.4, 
2.8, 2.9) as well as in wave motion, heat conduction, and other parts of physics, as we 
shall see in Chap. 12. 

A second-order ODE is called linear if it can be written 
(1) y” + p(x)y' + q(x)y = r(x) 


and nonlinear if it cannot be written in this form. 

The distinctive feature of this equation is that it is linear in y and its derivatives, whereas 
the functions p , q , and r on the right may be any given functions of jc. If the equation 
begins with, say, f(x)y'\ then divide by f(x) to have the standard form (1) with y" as 
the first term, which is practical. 
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CHAP. 2 Second-Order Linear ODEs 


EXAMPLE 1 


If r(x) = 0 (that is, r(x) = 0 for all x considered; read “r(x) is identically zero”), then 

( 1 ) reduces to 

(2) y” + p(x)y' + cj(x)y = 0 

and is called homogeneous. If r(.v) ^ 0, then (1) is called nonhomogeneous. This is 
similar to Sec. 1.5. 

For instance, a nonhomogeneous linear ODE is 

y" -1- 25y = e~ x cos x, 
and a homogeneous linear ODE is 

xy" -I- y f 4- xy = 0, in standard form y N 4- — y' 4- y = 0. 

An example of a nonlinear ODE is 


y"y + y' 2 = 0. 

The functions p and q in (1) and (2) are called the coefficients of the ODEs. 
Solutions are defined similarly as for first-order ODEs in Chap. 1. A function 

y = AW 

is called a solution of a (linear or nonlinear) second-order ODE on some open interval I 
if /? is defined and twice differentiable throughout dial interval and is such that the ODE 
becomes an identity if we replace the unknown y by /z, the derivative y by h f , and the 
second derivative y” by /z". Examples are given below. 

Homogeneous Linear ODEs: Superposition Principle 

Sections 2. 1-2.6 will be devoted to homogeneous linear ODEs (2) and the remaining 
sections of the chapter to nonhomogeneous linear ODEs. 

Linear ODEs have a rich solution structure. For the homogeneous equation the backbone 
of this structure is the superposition principle or linearity principle , which says that we 
can obtain further solutions from given ones by adding them or by multiplying them with 
any constants. Of course, diis is a great advantage of homogeneous linear ODEs. Let us 
first discuss an example. 

Homogeneous Linear ODEs: Superposition of Solutions 

The functions v — cos x and v = sin .v are solutions of the homogeneous linear ODE 

v + v = 0 

for all .v. We verify this by differentiation and substitution. We obtain (cos xf = — cos.v; hence 
y n + y = (cos .v) ,/ + cos .v = —cos .v + cos .v = 0. 

Similarly for y = sin x (verily!). We can go an important step further. We multiply cos.v by any constant, for 
instance, 4.7, and sin.v by, say, —2, and take the sum of the results, claiming that it is a solution. Indeed, 
differentiation and substitution gives 

(4.7 cos x - 2 sin xf + (4.7 cos x - 2 sin x) = -4.7 cos .v + 2 sin .v + 4.7 cos .v - 2 sin x = 0. ■ 
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In this example we have obtained from y t (= cos x) and y 2 (= sin x) a function of the form 

(3) y = c x y x + c 2 y 2 (c 1? c 2 arbitrary constants). 

This is called a linear combination of y x and y 2 . In terms of this concept we can now 
formulate the result suggested by our example, often called the superposition principle 
or linearity principle. 


THEOREM 1 


Fundamental Theorem for the Homogeneous Linear ODE (2) 

For a homogeneous linear ODE (2), any linear combination of two solutions on an 
open intetval / is again a solution of (2) on /. In particular , for such an equation , 
sums and constant multiples of solutions are again solutions . 


PROOF Let y x and y 2 be solutions of (2) on /. Then by substituting y = c x y x 4* c 2 y 2 and its 
derivatives into (2), and using the familiar rule (CiJx 4- c 2 y 2 ) r = + c 2 y 2 , etc., we 

get 


y" + py' + qy - (c^ + c 2 y 2 )" + p(c + c 2 y 2 )' + q{ciy\ + c 2 y 2 ) 

= Cl}’" + c 2 y 2 + p(c x y[ + c 2 y 2 ) + qic^ + c 2 y 2 ) 

= Ci(y" + py'i + qy i) + c 2 (y 2 + py 2 + qy 2 ) = o, 

since in the last line, (*••) = 0 because y x andy 2 %re solutions, by assumption. This shows 
that y is a solution of (2) on I. ■ 

CAUTION! Don’t forget that this highly important theorem holds for homogeneous 
linear ODEs only but does not hold for nonhomogeneous linear or nonlinear ODEs, as 
the following two examples illustrate. 

EXAMPLE 2 A Nonhomogeneous Linear ODE 

Verify by substitution that the functions y = 1 + cos x and y = 1 + sin .v are solutions of the nonhomogeneous 
linear ODE 

y* + y = 1. 

but their sum is not a solution. Neither is, for instance, 2(1 -1- cos.v) or 5(1 + sin.v). M 

EXAMPLE 3 A Nonlinear ODE 

Verily by substitution that the functions y = .v 2 and y = 1 are solutions of the nonlinear ODE 

tt / 

V v _ ,vv = o, 

but their sum is not a solution. Neither is — .v 2 , so you cannot even multiply by — 1 ! I 


Initial Value Problem. Basis. General Solution 

Recall from Chap. 1 that for a first-order ODE, an initial value problem consists of the 
ODE and one initial condition y{x 0 ) = y 0 . The initial condition is used to determine the 
arbitrary constant c in the general solution of the ODE. This results in a unique solution, 
as we need it in most applications. That solution is called a particular solution of the 
ODE. These ideas extend to second-order equations as follows. 
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EXAMPLE 4 


For a second-order homogeneous linear ODE (2) an initial value problem consists of 
(2) and two initial conditions 

(4) y{x o) = K 0f y'(x Q ) = K v 

These conditions prescribe given values K 0 and K x of the solution and its first derivative 
(the slope of its curve) at the same given x = x 0 in the open interval considered. 

The conditions (4) are used to determine the two arbitrary constants c x and c 2 in a 

general solution 

(5) y = c x y x + c 2 y 2 


of the ODE; here, y x and y 2 are suitable solutions of the ODE, with “suitable” to be 
explained after the next example. This results in a unique solution, passing through the 
point (a* 0 , K 0 ) with K x as the tangent direction (the slope) at that point. That solution is 
called a particular solution of the ODE (2). 

Initial Value Problem 

Solve the initial value problem 

y" + y = 0, ,v(0) = 3.0, /(0) = -0.5. 

Solution. Step 1. General solution. The functions cos .t and sin .v are solutions of the ODE (by Example 
I ), and we take 

y = c x cos a- 4- c 2 sin x. 

This will turn out to be a general solution as defined below. 

Step 2. Particular solution . We need the derivative y = —c 1 sin x + c 2 cos x. From this and the initial values 
we obtain, since cos 0 = I and sin 0 = 0, 


y(0) = Cl = 3.0 and y'(0) = c 2 = -0.5. 

This gives as the solution of our initial value problem the particular solution 

y = 3.0 cos a — 0.5 sin a. 

Figure 28 shows that at a = 0 it has the value 3.0 and the slope -0.5, so that its tangent intersects the A-axis 
at a = 3.0/0.5 = 6.0. (The scales on the axes differ!) M 



Fig. 28. Particular solution and initial tangent in Example 4 


Observation. Our choice of y x and y 2 was general enough to satisfy both initial 
conditions. Now let us take instead two proportional solutions y x = cosjc and 
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v 2 = k cos x, so that y 1 ly 2 = \/k = const. Then we can write y = + c 2 V 2 in the 

form 


y = Ci cos x + c 2 (k cos x) = C cos x where C = + c 2 k. 

Hence we are no longer able to satisfy two initial conditions with only one arbitrary 
constant C. Consequently, in defining the concept of a general solution, we must exclude 
proportionality. And we see at the same time why the concept of a general solution is of 
importance in connection with initial value problems. 


DEFINITION 


General Solution, Basis, Particular Solution 

A general solution of an ODE (2) on an open interval 7 is a solution (5) in which 
yi and y 2 are solutions of (2) on 7 that are not proportional, and c± and c 2 are arbitrary 
constants. These y lf y 2 are called a basis (or a fundamental system) of solutions 
of (2) on /. 

A particular solution of (2) on 7 is obtained if we assign specific values to c x 
and c 2 in (5). 


For the definition of an internal see Sec. 1.1. Also, c 1 and c 2 must sometimes be restricted 
to some interval in order to avoid complex expressions in the solution. Furthermore, as 
usual, y x and y 2 are called proportional on 7 if for all x on 7, 

(6) (a) y 1 = ky 2 or (b) y 2 = fyi 

where k and / are numbers, zero or not. (Note that (a) implies (b) if and only if k =£ 0). 

Actually, we can reformulate our definition of a basis by using a concept of general 
importance. Namely, two functions y x and y 2 are called linearly independent on an 
interval 7 where they are defined if 

(7) kiy x (jc) + k 2 y 2 (x) = 0 everywhere on 7 implies k x = 0 and k 2 - 0. 

And y x and y 2 are called linearly dependent on 1 if (7) also holds for some constants 
k 2 not both zero. Then if k x =£ 0 or k 2 ^ 0, we can divide and see that y x and y 2 are 
proportional, 

k 2 ki 

= ~ T” )’2 or y 2 = - — yi. 

Ki K 2 

In contrast, in the case of linear independence these functions are not proportional because 
then we cannot divide in (7). This gives the following 


DEFINITION 


Basis (Reformulated) 

A basis of solutions of (2) on an open interval 1 is a pair of linearly independent 
solutions of (2) on 7. 


If the coefficients p and q of (2) are continuous on some open interval /, then (2) has a 
general solution. It yields the unique solution of any initial value problem (2), (4). It 
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EXAMPLE 5 


EXAMPLE 6 


EXAMPLE 7 


includes all solutions of (2) on /; hence (2) has no singular solutions (solutions not 
obtainable from of a general solution; see also Problem Set 1.1). All this will be shown 
in Sec. 2.6. 

Basis, General Solution, Particular Solution 

cos a* and sin.v in Example 4 form a basis of solutions of the ODE y" 4y = 0 for all x because their quotient 
is cot a* ^ const (or tan a* =r= const). Hence y = c* cos a 4* c 2 sin a* is a general solution. The solution 
y = 3.0 cos a* - 0.5 sin x of the initial value problem is a particular solution. ■ 

Basis, General Solution, Particular Solution 

Verify by substitution that = e x and y 2 = e~ x are solutions of the ODE y" - y = 0. Then solve the initial 

value problem 

y"-y = 0, y(0) = 6, y'(0) = -2. 

Solution, (e*)" - e x = 0 and {e~ x f — e~ x = 0 shows that e x and e~ x are solutions. They are not 
proportional. e x !c~ x = e 2 * const. Hence e x , e~ x form a basis for all a*. We now write down the corresponding 
general solution and its derivative and equate their values at 0 to the given initial conditions. 

y = c x e x -1- c 2 e~ x . j/ = c t e x — c 2 e~ x , v( 0) = c l 4 c 2 = 6. y'(0) = C’i — c 2 = -2. 

By addition and subtraction, c x = 2. c 2 = 4, so that the answer is y = 2e x 4 4e~ x . This is the particular solution 
satisfying the two initial conditions. ■ 

Find a Basis if One Solution Is Known. 

Reduction of Order 

It happens quite often that one solution can be found by inspection or in some other way. 
Then a second linearly independent solution can be obtained by solving a first-order ODE. 
This is called the method of reduction of order . 1 We first show this method for an example 
and then in general. 

Reduction of Order if a Solution Is Known. Basis 

Find a basis of solutions of the ODE 


(.V 2 - x)y" - xy' + y = 0. 

Solution. Inspection shows that yi = x is a solution because = 1 and y" = 0, so that the first term 
vanishes identically and the second and third terms cancel. The idea of the method is to substitute 

y = tty i = ux. / = ux 4 //, y" = ux 4 2 u 

into the ODE. This gives 

(.v 2 — a)(//a 4 hi) — x{u x 4 n) 4 nx = 0. 

ux and —x it cancel and we are left with the following ODE, which we divide by a\ order, and simplify, 

(a - 2 — .v)(i/a* 4 2u) — x 2 u = 0, (a - 2 “ a)m" 4 {x - 2)u f = 0. 


Credited to the great mathematician JOSEPH LOUIS LAGRANGE (1736-1813), who was born in Turin, 
of French extraction, got his first professorship when he was 1 9 (at the Military Academy of Turin), became 
director of the mathematical section of the Berlin Academy in 1766. and moved to Paris in 1787. His important 
major work was in the calculus of variations, celestial mechanics, general mechanics {Mdcanique anatytique, 
Paris, 1788), differential equations, approximation theory, algebra, and number theory. 



SEC 2.1 Homogeneous Linear ODEs of Second Order 


51 


This ODE is of first order in v = u , namely, (* 2 — x)v* 4 (a- - 2)v = 0. Separation of variables and integration 
gives 

dv x - 2 / 1 2 \ \x — l| 

— = ^v= f - jjdx , In \v\ = In \x - l| - 2 In |*| = In . 

We need no constant of integration because we want to obtain a particular solution; similarly in the next 
integration. Taking exponents and integrating again, we obtain 

v = — 2 ~ = — . u = Jv dx = In |*| 4- -j . hence y 2 = ux = jc In \x\ 4* 1. 

Since y x = a- and y 2 = x In |.v| + 1 are linearly independent (their quotient is not constant), we have obtained 
a basis of solutions, valid for all positive jc. M 


In this example we applied reduction of order to a homogeneous linear ODE [see (2)] 

y" + p(x)y* + q(*)y = o. 

Note that we now take the ODE in standard form, with y n , not f(x)y " — this is essential 
in applying our subsequent formulas. We assume a solution y 1 of (2) on an open interval 
/ to be known and want to find a basis. For this we need a second linearly independent 
solution y 2 of (2) on /. To get y 2 , we substitute 


y = j '2 = «)’i> y' = yL = u'yi + uyi y" = y'l = u"y x + iu'y[ + uy'{ 


into (2). This gives 

(8) u n y 1 + 2 uy[ + uy'[ 4* p(uy ± + uy'i) + quy x = 0. 

Collecting terms in u , and u , we have 

uy x + u'(2y[ + py x ) + u{y'{ + py[ + qy x ) = 0. 

Now comes the main point. Since y 1 is a solution of (2), the expression in the last 
parentheses is zero. Hence u is gone, and we are left with an ODE in u and u". We divide 
this remaining ODE by y x and set u = U, u = (/, 

u" + u 2 - Vl + pyi = 0, thus U' + [ — + i?) U = 0. 
yi \yi / 

This is the desired first-order ODE, the reduced ODE. Separation of variables and 
integration gives 

dx and In |£/| = —2 In |yj] —Jpdx. 

By taking exponents we finally obtain 



U = 


-fp dx 


yi 


( 9 ) 
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Here U = w\ so that u = Jl/ dx. Hence the desired second solution is 

)'2 = J’i« = )'i / Udx. 

The quotient y 2 /yi = u = fU dx cannot be constant (since U > 0), so that y x and y 2 form 
a basis of solutions. 



[mT] general solution, initial value 

PROBLEM 

(More in the next problem set.) Verify by substitution that 
the given functions form a basis. Solve the given initial 
value problem. (Show the details of your work.) 

1. y" - I6.v = 0, e* x , e~ 4x , y(0) = 3, /( 0) = 8 

2. y" + 25 y = 0, cos 5x, sia5.v, y(0) = 0.8, 
/(0)='-6.5 

3. y" + 2y + 2y = 0, e~ x cosx, e~ x sinv, 

>•(0) = l,/(0) = -1 

4. y" - 6y' + 9y = 0, e 3x , xe :,x , y(0) = -1.4, 
>•'(0) = 4.6 

5. x 2 y" + xy' - Ay = 0, a- 2 , a:" 2 , y(l) = 11, 

.v'(l) = "6 

6. vV' - 7 xy' + I5v = 0, a- 3 , .v s , v(l ) = 0.4, 

/(D= 1.0 

1 7-14 | UNEAR INDEPENDENCE AND DEPENDENCE 

Are the following functions linearly independent on the 
given interval? 

7. a*, a In x (0 < a* < 10) 

8. 3a 2 , 2a n (0 < a < 1) 

9. e ax , e~ ax (any interval) 

10. cos 2 a, sin 2 a (any interval) 

11. In a. In a 2 (a > 0) 

12. x — 2, a “4-2( — 2 <a< 2) 

13. 5 sin a cos a, 3 sin 2 a (a > 0) 

14. 0, sinh ttx (a > 0) 

REDUCTION OF ORDER is important because it gives a 
simpler ODE. A second-order ODE F( x, y, y\y H ) = 0, linear 
or not, can be reduced to first order if y does not occur 
explicitly (Prob. 15) or if a does not occur explicitly (Prob. 
16) or if the ODE is homogeneous linear and we know a 
solution (see the text). 

15. (Reduction) Show that F( a, y\y") = 0 can be reduced 
to first order in z = y (from which y follows by 
integration). Give two examples of your own. 

16. (Reduction) Show that F(y, y r , y") = 0 can be reduced 
to a first-order ODE with y as the independent variable 
and y" = (dz!dy)z, where z = y \ derive this by the 
chain rule. Give two examples. 


|n-22j 

step in detail). 

17. y” = ky' 


Reduce to first order and solve (showing each 


18. /' = 1 + y' 2 

19. yy" = 4/ 2 


20. xy" + 2y ' + xy = 0, = v 1 cosv 

21. y" + y' 3 sin.v = 0 

22. (1 - v 2 )y" - 2 xy' + 2y = 0, >’i = x 


23. (Motion) A small body moves on a straight line. Its 
velocity equals twice the reciprocal of its acceleration. 
If at t = 0 the body has distance I m from the origin 
and velocity 2 m/sec, what are its distance and velocity 
after 3 sec? 

24. (Hanging cable) It can be shown that the curve >(a) 
of an inextensible flexible homogeneous cable 
hanging between two fixed points is obtained by 

solving /' = 1 +y 2 , where the constant k depends 

on the weight. This curve is called a ccnenaiy (from 
Latin catena = the chain). Find and graph >»(a), 
assuming k — 1 and those fixed points are (— 1, 0) and 
(1, 0) in a vertical Ay-plane. 

25. (Curves) Find and sketch or graph the curves passing 
through the origin with slope 1 for which the second 
derivative is proportional to the first. 

26. WRITING PROJECT. General Properties of 
Solutions of Linear ODEs. Write a short essay (with 
proofs and simple examples of your own) that includes 
the following. 

(a) The superposition principle. 

(b) .y = 0 is a solution of the homogeneous equation 
(2) (called the trivial solution). 

(c) The sum y = yj + y 2 of a solution y t of (1) and 
y 2 of (2) is a solution of (1). 

(d) Explore possibilities of making further general 
statements on solutions of (1) and (2) (sums, 
differences, multiples). 

27. CAS PROJECT. Linear Independence. Write a 
program for testing linear independence and 
dependence. Try it out on some of the problems in this 
problem set and on examples of your own. 
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2u Homogeneous Linear ODEs 
with Constant Coefficients 

We shall now consider second-order homogeneous linear ODEs whose coefficients a and 
b are constant, 

(1) y" + ay' + by — 0. 

These equations have important applications, especially in connection with mechanical 
and electrical vibrations, as we shall see in Secs. 2.4, 2.8, and 2.9. 

How to solve (1)? We remember from Sec. 1.5 that the solution of the first-order linear 
ODE with a constant coefficient k 

/ + = 0 

is an exponential function y = ce ~ kx . This gives us the idea to try as a solution of (l) the 
function 

(2) y = e kx . 

Substituting (2) and its derivatives 

y 9 = ke kx and y" = k 2 e kx 
into our equation (1), we obtain 

(A 2 + ak + b)e kx = 0. 

Hence if A is a solution of the important characteristic equation (or auxiliary equation) 

(3) A 2 + ak + b = 0 

then the exponential function (2) is a solution of the ODE (1). Now from elementary 
algebra we recall that the roots of this quadratic equation (3) are 

(4) A x = \{—a + Vfl 2 — 4b), A 2 = |(— a — Va 2 — 4b). 

(3) and (4) will be basic because our derivation shows that the functions 

(5) y x = e kxX and y 2 = r 

are solutions of (1). Verify this by substituting (5) into (1). 

From algebra we further know that the quadratic equation (3) may have three kinds of 
roots, depending on the sign of the discriminant a 2 — 4b , namely, 

(Case I) Two real roots if a 2 — 4b > 0, 

(Case II) A real double root if a 2 — 4b = 0, 

(Case HI) Complex conjugate roots if a 2 — 4b < 0. 



54 


CHAP. 2 Second-Order Linear ODEs 


EXAMPLE I 


EXAMPLE 2 


Case I. Two Distinct Real Roots and A 2 

In this case, a basis of solutions of (1) on any interval is 

y x = e XlX and y 2 = e x * x 

because y x and y 2 are defined (and real) for all x and their quotient is not constant. The 
corresponding general solution is 

(6) y = Cl e x ' x + 


General Solution in the Case of Distinct Real Roots 

We can now solve y" - y = 0 in Example 6 of Sec. 2.1 systematically. The characteristic equation is 
A 2 — 1 =0. Its roots are A x = 1 and A 2 = — I. Hence a basis of solutions is e x and e~ x and gives the same 
general solution as before. 


v = c x e x + c 2 e x . H 

Initial Value Problem in the Case of Distinct Real Roots 

Solve the initial value problem 

y" + y - 2y = 0, y(0) = 4, /(0) = -5. 

Solution. Step 1. General solution. The characteristic equation is 

A 2 + A - 2 = 0. 

Its roots are 

Ai = s(“l + V?) = I and A 2 = |(-1 - V9) = -2 
so that we obtain the general solution 

X I —%x 

y = c\e -I- c 2 e 

Step 2. Particular solution. Since y{x) = cic x - 2 c 2 c _2a; , we obtain from the general solution and the initial 
conditions 

y(0) = c x + c 2 = 4. 
y ; (0) = c x - 2c 2 = -5. 

Hence c x = I and c 2 - 3. This gives the answer y = e x + 3e~ 2x . Figure 29 shows that the curve begins at 
y = 4 with a negative slope (-5, but note that the axes have different scales!), in agreement with the initial 
conditions. ■ 



Fig. 29. Solution in Example 2 
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EXAMPLE 3 


EXAMPLE 4 


Case II. Real Double Root A = —a/2 

If the discriminant a 2 — 4 b is zero, we see directly from (4) that we get only one root, 
A = Ax = A 2 = —afl, hence only one solution, 

y x = <r (a/2) * 

To obtain a second independent solution y 2 (needed for a basis), we use the method of 
reduction of order discussed in the last section, setting y 2 = m.Vi* Substituting this and its 
derivatives y 2 = it'y l + uy[ and y 2 into (1), we first have 

(u'yi 4- 2 uy[ -1- uy") 4- a(u'y 1 4- uy[) 4- buy i = 0. 

Collecting tei*ms in u", it', and u, as in the last section, we obtain 

uyi 4 u\2y[ 4 ay x ) 4 u{y'[ 4 ciy[ 4 by x ) = 0. 

The expression in the last parentheses is zero, since y x is a solution of (1). The expression 
in the first parentheses is zero, too, since 

2y( = -cte-™ 12 = -ay v 

We are thus left with u"y 1 = 0. Hence u" = 0. By two integrations, u = c x x 4 c 2 . To 
get a second independent solution y 2 = uy l9 we can simply choose c a = 1, c 2 = 0 and 
take u = jc. Then y 2 = xy x . Since these solutions are not proportional, they form a basis. 
Hence in the case of a double root of (3) a basis of solutions of (1) on any interval is 

e~ axl2 , xe ~ ax/ 2 . 

The corresponding general solution is 

(7) y = (.c 1 + c 2 x)e- ax/2 . 

Warning. If A is a simple root of (4), then (c x 4 c 2 x)e* x with c 2 0 is not a solution 
of CD- 

General Solution in the Case of a Double Root 

The characteristic equation of the ODE y* 4- 6/ + 9y = 0 is A 2 + 6A + 9 = (A + 3) 2 = 0. It has the double 
root A = —3. Hence a basis is and xe~^ x . Tlie corresponding general solution is y = (cj + c 2 x)e~ Sx . M 

Initial Value Problem in the Case of a Double Root 

Solve the initial value problem 

/ -i- / + 0.25.V = 0, ,v(0) = 3.0, .v'CO) = -3.5. 

Solution . The characteristic equation is A 2 + A + 0.25 = (A + 0.5) 2 = 0. It has the double root A = —0.5. 
This gives the general solution 

v = (c x + c 2 x)e~°'^ x . 

We need its derivative 

/ = c z e~ 0 5x - 0.5(cj 4 * c 2 .x)e~ 0 5x . 
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EXAMPLE 5 


From this and the initial conditions we obtain 


.v(0) = c x = 3.0. v'(0) = c 2 - 0.5 Ci = -3.5; hence c 2 = -2. 

The particular solution of the initial value problem is y = (3 — 2v)^" 0,5,r . See Fig. 30. I 



Fig. 30. Solution in Example 4 


Case III. Complex Roots — \a + icj and — \a — i(o 

This case occurs if the discriminant a 2 — 4b of the characteristic equation (3) is negative. 
In this case, the roots of (3) and thus the solutions of the ODE (1) come at first out 
complex. However, we show that from them we can obtain a basis of real solutions 

(8) Vi = e~ axt2 cos (ox , y 2 = e~ axl2 sin cox ( to > 0) 

where co 2 = b - \a 2 . It can be verified by substitution that these are solutions in the 
present case. We shall derive them systematically after the two examples by using the 
complex exponential function. They form a basis on any interval since their quotient 
cot cox is not constant. Hence a real general solution in Case III is 

(9) y = <? -aar/2 (A cos (ox + B sin (ox) (A, B arbitrary). 

Complex Roots. Initial Value Problem 

Solve die initial value problem 


v" -f 0.4y # + 9.04y = 0, y(0) = 0, y'(0) = 3. 

Solution . Step L General solution. The characteristic equation is A 2 + 0.4A + 9.04 = 0. It has the roots 
-0.2 ± 3/. Hence o) = 3, and a general solution (9) is 

v = e~°' 2x (A cos 3.v + B sin 3 a). 

Step 2. Particular solution . The first initial condition gives y(0) = 4=0. The remaining expression is 
y = Be~ 0,2x sin 3.v. We need die derivative (chain rule!) 

y r = B(-0.2e~ O 2x sin 3 a* + 3e~°' 2x cos 3.v). 

From this and the second initial condition we obtain y'(0) = 3B = 3. Hence 5=1. Our solution is 

y = e~ 0 2x sin 3.v. 

Figure 31 shows y and die curves of e~ 0,2x and —e~ 0 2x (dashed), between which the curve of y oscillates. 
Such “damped vibrations'* (with a* = / being time) have important mechanical and electrical applications, as we 
shall soon see (in Sec. 2.4). M 
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EXAMPLE 6 



Complex Roots 

A general solution of the ODE 


is 


H . 2 n> 

y 4- (o y = 0 


y = A cos wi + B sin coa\ 


With a) = I this confirms Example 4 in Sec. 2. 1 . 


{cd constant, not zero) 


Summary of Cases l-lll 


Case 

Roots of (2) 

Basis of (1) 

General Solution of (1) 

I 

Distinct real 
Ai, A 2 


y - c\e XlX + c 2 e x * x 

n 

Real double root 
A = a 

q o,x/2 <ixl2 

y = (Ci + c 2 x)e~ ax,z 

m 

Complex conjugate 
A x = -\ci + ico, 
A 2 = —\a — i(o 

e -ax/2 CQS 
e -ax!2 s j n 

y = e~ ax/2 (A cos (ox + B sin wx) 


It is very interesting that in applications to mechanical systems or electrical circuits, 
these three cases correspond to three different forms of motion or flows of current, 
respectively. We shall discuss this basic relation between theory and practice in detail in 
Sec. 2.4 (and again in Sec. 2.8). 

Derivation in Case III. Complex Exponential Function 

If verification of the solutions in (8) satisfies you, skip the systematic derivation of these 
real solutions from the complex solutions by means of the complex exponential function 
e z of a complex variable z — r 4- It. We write r 4- /r, not x 4- iy because x and y occur 
in the ODE. The definition of e z in terms of the real functions <? r , cos t , and sin t is 


( 10 ) 


e z = e r * u = e r e lt = e 7 ’(cos t + i sin t). 
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This is motivated as follows. For real z = r, hence t = 0, cos 0 = 1, sin 0 = 0, we get 
the real exponential function e r . It can be shown that e Zl+Z2 = e* 1 ^ 2 , just as in real. (Proof 
in Sec. 13.5.) Finally, if we use the Maclaurin series of e z with z = it as well as i 2 = —1, 
/ 3 = — /, / 4 = 1, etc., and reorder the terms as shown (this is permissible, as can be proved), 
we obtain the series 




2! 


3! 


3 (iff (it) 5 
+ ^ ~ + 


4! 


5! 


t 2 t 4 

" 1_ * + ^~ + 


/ t 3 t 5 \ 

+ / r 5T + ir- + --j 


= cos t -I- / sin t. 


(Look up these real series in your calculus book if necessary.) We see that we have obtained 
the formula 

(11) e zt = cos t -f i sin r, 


called the Euler formula. Multiplication by e r gives (10). 

For later use we note that e~ lt = cos (—0 + i sin (— t) = cost — i sin /, so that by 
addition and subtraction of this and ( 11 ), 

(12) cos / = — - (e zt 4- e~ il ), sin r = ^7 ( e xt - e~ u ). 

After these comments on the definition (10), let us now turn to Case 111. 

In Case El the radicand a 2 — 4 b in (4) is negative. Hence 4 b — a 2 is positive and, 
using V— 1 = /, we obtain in (4) 

!Va 2 - 4b = §V-(4 b - a 2 ) = V -(b -±a 2 ) = iVb - \a 2 = too 
with co defined as in ( 8 ). Hence in (4), 

Ai = \a + ico and, similarly, A 2 = — i<o. 

Using (10) with r = — and / = <ox> we thus obtain 

= ^-(a/2).T+to>ar _ £-<a/2)^ c()S ^ 4. / s j n 
= e -(alZ)x-uox — £“< a / 2 >*( cos — / $in (OX). 

We now add these two lines and multiply the result by This gives y 1 as in (8). Then 
we subtract the second line from the first and multiply the result by 1/(2/). This gives y 2 
as in (8). These results obtained by addition and multiplication by constants are again 
solutions, as follows from the superposition principle in Sec. 2.1. This concludes the 
derivation of these real solutions in Case IH. 
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rl»EO^EEIiiEr£E3^2^ 3ZZZ: 


1 1—14 1 GENERAL SOLUTION 

Find a general solution. Check your answer by substitution. 

1. y" - 6/ - 7 y = 0 

2. 10/ - 7/ + l.2v = 0 

3. 4y" - 20/ + 25y = 0 

4. y" + Ait y' + 4 ir 2 y = 0 

5. 100)>" + 20 y - 99)’ = 0 

6. y" + 2y' + 5)' = 0 

7. )’" - / + 2.5 y = 0 

8. )>" + 2.6/ + 1 .69 y = 0 

9. )-" - 2/ - 5.25)’ = 0 

10. )’" - 2y = 0 

11. )’" + 9 ir 2 y = 0 12. y" + 2Ay' + 4.0y = 0 

13. )’" - 1 44 v = 0 14. y" + / - 0.96.V = 0 


15-20 


FIND ODE 


Find an ODE y" + ay' 
15. <? 2r , 

17. e xVS , xe xs/ * 

19. e 4x , e~ 4x 


+ by = 0 for the given basis. 
16. e 0 5x , e~ 3 5x 
18. 1, e~ 3x 
20. + <l + i)* 


21-32 


INITIAL VALUE PROBLEMS 


Solve the initial value problem. Check that your answer 
satisfies the ODE as well as the initial conditions. (Show 
the details of your work.) 

21. y" - 2y - 3y = 0. y(0) = 2, y'(0) = 14 

22. y" + 2 :>•' + y = 0, y(0) = 4, y'(0) = -6 

23. y" + 4y' + 5y = 0, y(0) = 2, )<'(()) = —5 

24. lOy" - 50y' + 65y = 0, y(0) = 1.5, y'(0) = 1.5 

25. y" + try' = 0, y(0) = 3, y'( 0) = — tr 

26. lOy" + 18y' + 5.6y = 0, y(0) = 4, y'(0) = -3.8 


27. 10y" + 5y' + 0.625y = 0, y(0) = 2, y'(0) = -4.5 

28. y" - 9y = 0, y(0) = -2, y'(0) = -12 

29. 20y" + 4y' + y = 0, y(0) = 3.2, y'(0) = 0 

30. y" + 2ky ' + (>fc 2 + <o 2 )y = 0, y(0) = 1, 
y'(0) = -k 

31. y" - 25y = 0, y(0) = 0, .v'(0) = 40 

32. y" - 2y' - 24y = 0, y(0) = 0, y'(0) = 20 

33. (Instability) Solve y" — y = 0 for the initial conditions 
y(0) = 1, y'(0) = — 1. Then change the initial conditions 
to y(0) = 1.001, y'(0) = —0.999 and explain why this 
small change of 0.001 at a = 0 causes a large change 
later, e.g., 22 at a* = 10. 

34. TEAM PROJECT. General Properties of Solutions 

(A) Coefficient formulas. Show how a and b in (1) 
can be expressed in terms of Aj and A 2 . Explain how 
these formulas can be used in constructing equations 
for given bases. 

(B) Root zero. Solve y" 4* 4y' = 0 (i) by the present 
method, and (ii) by reduction to first order. Can you 
explain why the result must be the same in both cases? 
Can you do the same for a general ODE y " 4 ay = 0? 

(C) Double root. Verify directly that xe Xx with 
A = —a! 2 is a solution of (1) in the case of a double 
root. Verify and explain why y = e~ 2x is a solution of 
y" — y — 6y = 0 but xe~ 2x is not. 

(D) Limits. Double roots should be limiting cases of 
distinct roots A^ A 2 as, say, A 2 — » A^ Experiment with 
this idea. (Remember I’HdpitaTs rule from calculus.) 
Can you arrive at xe XlXf ! Give it a try. 

35. (Verification) Show by substitution that y x in (S) is a 
solution of (1). 


2 .: Differential Operators. Optional 

This short section can be omitted without interrupting the flow of ideas; it will not be 
used in the sequel (except for the notations Dy, D 2 y , etc., for y \ y", etc.). 

Operational calculus means the technique and application of operators. Here, an 
operator is a transformation that transforms a function into another function. Hence 
differential calculus involves an operator, the differential operator £>, which transforms 
a (differentiable) function into its derivative. In operator notation we write 


( 1 ) 





60 


CHAP. 2 Second-Order Linear ODEs 


EXAMPLE 1 


Similarly, for the higher derivatives we write D 2 y = D(Dy) = y", and so on. For example, 
D sin = cos. D 2 sin = —sin, etc. 

For a homogeneous linear ODE y" 4- ciy 4- by = 0 with constant coefficients we can 
now introduce the second-order differential operator 

L = P(D) = D 2 + aD + bl, 

where I is the identity operator defined by Iy = y. Then we can write that ODE as 
(2) Ly = P(D)y = (D 2 + aD + bl)y = 0. 

P suggests “polynomial.” L is a linear operator. By definition this means that if Ly and 
Lw exist (this is the case if y and w are twice differentiable), then L(cy 4- few ) exists for 
any constants c and k , and 


L(cy 4- lew) = cLy 4- kLw. 

Let us show that from (2) we reach agreement with the results in Sec. 2.2. Since 
( De x )(x ) = \e Xx and (D 2 e x )( x) = X 2 e Ax , we obtain 

Le\x) = P(D)e\x) = (D 2 4- aD 4- bl)e\x) 

( 3 ) 

= (A 2 4- a\ + fye** = P( A)^ = 0. 

This confirms our result of Sec. 2.2 that e Ax is a solution of the ODE (2) if and only if X 
is a solution of the characteristic equation P( A) = 0. 

P( A) is a polynomial in the usual sense of algebra. If we replace A by the operator D, 
we obtain the “operator polynomial” P(D). The point of this operational calculus is that 
P(D) can be treated just like an algebraic quantity. In particular, we can factor it. 

Factorization, Solution of an ODE 

Factor P(D) = D 2 - 3D - 40/ and solve P(D)y = 0. 

Solution . D 2 - 3D - 40/ = (0 - 8 1)(D + 5/) because I 2 
solution yi = e Sx . Similarly, the solution of (D + 5f)y = 0 is y 2 
interval. From the factorization we obtain the ODE, as expected, 

(D - 8/)(D 4 - 5/)y - (D - 8/)(y / + 5v) = D(y* + 5v) - 8(y' + 5v) 

= y" + 5y ; - 8 / - 40y = v" - 3 y - 40y = 0. 

Verify that this agrees with the result of our method in Sec. 2.2. This is not unexpected because we factored 
P{D) in the same way as the characteristic polynomial P( A) = A 2 — 3 A — 40. M 


= /. Now ( D - 8/)y = y r - 8 y = 0 has the 
= e" 5a ‘. This is a basis of P(D)y = 0 on any 


It was essential that L in (2) has constant coefficients. Extension of operator methods to 
variable-coefficient ODEs is more difficult and will not be considered here. 

If operational methods were limited to the simple situations illustrated in this 
section, it would perhaps not be worth mentioning. Actually, the power of the operator 
approach appears in more complicated engineering problems, as we shall see in 
Chap. 6. 
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1-5 


APPLICATION OF DIFFERENTIAL 
OPERATORS 


Apply the given operator to the given functions. (Show all 
steps in detail.) 

1. (D — 7) 2 ; e x , xe x , sin x 

2. 8 D 2 + 2D — /; cosh |a\ sinh^v, e 


,.r/2 


0.4a: 


e 5 * sin .v. 


3. £> - 0.4/; 2a- 3 - i , 

4. (Z> + 5 1)(D - /); 

5. (D - 4/)(D + 3/); a 3 - 

6-13 1 GENERAL SOLUTION 


xe 


OAx 


sin 4.v, e 


-3a: 


Factor as in the text and solve. (Show the details.) 

6. (£> 2 - 5.5 D + 6.66f)y = 0 

7. ( D + 2/) 2 .v = 0 8. (D 2 - 0.49/)>- = 0 

9. (£> 2 + 60 + 13/)>- = 0 

10. (10 O 2 + 20 + 1 . 7 / )>’ = 0 


11. (O 2 + 4.10 + 3.l/).y = 0 

12. (40 2 + 4 itD + 7t 2 /)_v = 0 

13. (O 2 + 1 7.64<o 2 /)y = 0 

14. (Double root) If Z) 2 + «0 + bl has distinct roots 
fx and A, show that a particular solution is 
y = (e** — e^/ifx — A). Obtain from this a solution 
xe Ax by letting fx— > A and applying rHdpitaTs rule. 

15. (Linear operator) Illustrate the linearity of L in (2) by 
taking c = 4, k = — 6, y = e 2ac , and u* = cos 2*. 
Prove that L is linear. 

16. (Definition of linearity) Show that the definition of 
linearity in the text is equivalent to the following. If 
L[y] and L[w] exist, then L\y + w] exists and Hey) 
and L[kw] exist for all constants c and k y and 
L[y + w] = L[}’] + L[w] as well as L[cy] = cL\y\ and 
L[kw ] = kL[w]. 


2.4 Modeling: Free Oscillations 
(Mass-Spring System) 

Linear ODEs with constant coefficients have important applications in mechanics, as we 
show now (and in Sec. 2.8), and in electric circuits (to be shown in Sec. 2.9). In this section 
we consider a basic mechanical system, a mass on an elastic spring (“mass-spring system,” 
Fig. 32), which moves up and down. Its model will be a homogeneous linear ODE. 

Setting Up the Model 

We take an ordinary spring that resists compression as well extension and suspend it 
vertically from a fixed support, as shown in Fig. 32. At the lower end of the spring we 



motion 


(a) (b) (c) 

Fig. 32. Mechanical mass-spring system 




62 


CHAP. 2 Second-Order Linear ODEs 


attach a body of mass m. We assume m to be so large that we can neglect the mass of the 
spring. If we pull the body down a certain distance and then release it, it starts moving. 
We assume that it moves strictly vertically. 

How can we obtain the motion of the body, say, the displacement y(i) as function of 
time r? Now this motion is determined by Newton’s second law 

(1) Mass X Acceleration = my" = Force 

where y" = d 2 y/dt 2 and “Force” is the resultant of all the forces acting on the body. 
(For systems of units and conversion factors, see the inside of the front cover.) 

We choose the downward direction as the positive direction , thus regarding downward 
forces as positive and upward forces as negative. 

Consider Fig. 32. The spring is first unstretched. We now attach the body. This stretches 
the spring by an amount s 0 shown in the figure. It causes an upward force F 0 in the spring. 
Experiments show that F 0 is proportional to the stretch .s 0 , say, 

(2) F 0 = —ks 0 (Hooke’s law 2 ). 

k (> 0) is called the spring constant (or spring modulus). The minus sign indicates that 
F 0 points upward, in our negative direction. Stiff springs have large k . (Explain!) 

The extension s 0 is such that F 0 in the spring balances the weight W = mg of the 
body (where g = 980 cm/sec 2 = 32.17 ft/sec 2 is the gravitational constant). Hence 
F 0 + W = — ks 0 + mg = 0. These forces will not affect the motion. Spring and body are 
again at rest. This is called the static equilibrium of the system (Fig. 32b). We measure 
the displacement y(/) of the body from this ‘equilibrium point’ as the origin y = 0, 
downward positive and upward negative. 

From the position y = 0 we pull the body downward. This further stretches the spring 
by some amount y > 0 (the distance we pull it down). By Hooke’s law this causes an 
(additional) upward force F 1 in the spring, 

F\ = 

F x is a restoring force. It has the tendency to restore the system, that is, to pull the body 
back to y = 0. 

Undamped System: ODE and Solution 

Every system has damping — otherwise it would keep moving forever. But practically, the 
effect of damping may often be negligible, for example, for the motion of an iron ball on 
a spring during a few minutes. Then F x is the only force in (1) causing the motion. Hence 
(1) gives the model my" = —ky or 

(3) my" + ky = 0. 


2 ROBERT HOOKE (1635-1703), English physicist, a forerunner of Newton with respect to the law of 
gravitation. 
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By the method in Sec. 2.2 (see Example 6) we obtain as a general solution 
( 4 ) y(t) = A cos (o 0 t + B sin 6> 0 f, (o 0 = 

The corresponding motion is called a harmonic oscillation. 

Since the trigonometric functions in (4) have the period 27r/o> 0 , the body executes o> 0 /2tt 
cycles per second. This is the frequency of the oscillation, which is also called the natural 
frequency of the system. It is measured in cycles per second. Another name for cycles/sec 
is hertz (Hz). 3 

The sum in (4) can be combined into a phase-shifted cosine with amplitude C = V/\ 2 + B 2 
and phase angle 8 = arctan ( B/A ), 

( 4 *) y(t) = C cos (co 0 t - 8). 



To verify this, apply the addition formula for the cosine [(6) in App. 3.1] to (4*) and then 
compare with (4). Equation (4) is simpler in connection with initial value problems, 
whereas (4*) is physically more informative because it exhibits the amplitude and phase 
of the oscillation. 

Figure 33 shows typical forms of (4) and (4*), all corresponding to some positive initial 
displacement y( 0) [which determines A = y(0) in (4)] and different initial velocities y'( 0) 
[which determine B = y f (0)/a) o ]. 



® Positive 
©Zero 
® Negative 


Initial velocity 


Fig. 33. Harmonic oscillations 


Undamped Motion. Harmonic Oscillation 

If an iron ball of weight W = 98 nt (about 22 lb) stretches a spring 1.09 m (about 43 in.), how many cycles per 
minute will this mass-spring system execute? What will its motion be if wc pull down the weight an additional 
16 cm (about 6 in.) and let it start with zero initial velocity? 

Solution . Hooke’s law (2) with W as the force and 1 .09 meter as the stretch gives W = 1 .09 k: thus 
k = WV1.09 = 98/1.09 = 90 [kg/sec 2 ] = 90 [nt/meterj. The mass is m = Wig = 98/9.8 = 10 [kg]. This gives 
the frequency coq/(2tt) = \Ztfml{2ir) — 3/(2 tt) - 0.48 [Hz] = 29 [cycles/min]. 


3 HEINRICH HERTZ (1857-1894), German physicist, who discovered electromagnetic waves, as the basis 
of wireless communication developed by GUGLIELMO MARCONI (1874—1937). Italian physicist (Nobel prize 
in 1909). 
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From (4) and the initial conditions, y(0) = A = 0.16 [meter] and /(0) = (OqB = 0. Hence the motion is 

y(t) = 0. 1 6 cos 3/ [meter] or 0.52 cos 3/ [ft] (Fig. 34). 

If you have a chance of experimenting with a mass^-spring system, don’t miss it. You will be surprised about 
the good agreement between theory and experiment, usually within a fraction of one percent if you measure 
carefully. ■ 

y 

0.2 
0.1 
0 

- 0.1 
-0.2 

Fig. 34. Harmonic oscillation in Example 1 

Damped System: ODE and Solutions 

We now add a damping force 

F 2 = -cy' 

to our model my" = -ky, so that we have my" = —ky — cy f or 
(5) my" + cy' + ky = 0. 

Physically this can be done by connecting the body to a dashpot; see Fig. 35. We assume 
this new force to be proportional to the velocity y = dy/dt, as shown. This is generally 
a good approximation, at least for small velocities. 

c is called the damping constant We show that c is positive. If at some instant, y is 
positive, the body is moving downward (which is the positive direction). Hence the 
damping force F 2 = — cy', always acting against the direction of motion, must be an 
upward force, which means that it must be negative, F 2 = — cy ' < 0, so that — c < 0 and 
c > 0. For an upward motion, y r < 0 and we have a downward F 2 = — cy > 0; hence 
— c < 0 and c > 0, as before. 

The ODE (5) is homogeneous linear and has constant coefficients. Hence we can solve 
it by the method in Sec. 2.2. The characteristic equation is (divide (5) by m) 




Fig. 35. Damped system 
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By the usual formula for the roots of a quadratic equation we obtain, as in Sec. 2.2, 

(6) A x = —a. + j 3, Ao = ~ct — jg, where a = -7— and B = -7— Vc 2 — 4mk. 

2m 2m 

It is now most interesting that depending on the amount of damping (much, medium, or little) 
there will be three types of motion corresponding to the three Cases I, n, II in Sec. 2.2: 

Case I. c 2 > 4 mk. Distinct real roots A x , A 2 . (Overdamping) 

Case H c 2 = \mk. A real double root. (Critical damping) 

Case in. c 2 < 4 mk. Complex conjugate roots. (Underdamping) 

Discussion of the Three Cases 

Case I. Overdamping 

If the damping constant c is so large that c 2 > 4mk, then X 1 and A 2 are distinct real roots. 
In this case the corresponding general solution of (5) is 

(7) y(t) = c 1 e~ ( - a - 0)t + c 2 e~ ( “ +/3>t . 

We see that in this case, damping takes out energy so quickly that the body does not 
oscillate. For / > 0 both exponents in (7) are negative because a > 0, /3 > 0, and 
/3 2 = a 2 — k/m < a 2 . Hence both terms in (7) approach zero as t — » Practically 
speaking, after a sufficiently long time the mass will be at rest at the static equilibrium 
position (y = 0). Figure 36 shows (7) for some typical initial conditions. 

Case II. Critical Damping 

Critical damping is the border case between nonoscillatory motions (Case I) and oscillations 
(Case III). It occurs if the characteristic equation has a double root, that is, if c 2 = 4 mk, 



©Zero 
(D Negative 



Initial velocity 


Fig. 36. Typical motions (7) in the overdamped case 

[a] Positive initial displacement 

(b) Negative initial displacement 
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so that /3 = 0, A x = A 2 = —ot. Then the corresponding general solution of (5) is 

(8) y{t) = (ci + c 2 t)e~ ttt . 

This solution can pass through the equilibrium position y = 0 at most once because e~ at 
is never zero and Cj + c 2 t can have at most one positive zero. If both c x and c 2 are positive 
(or both negative), it has no positive zero, so that y does not pass through 0 at all. Figure 
37 shows typical forms of (8). Note that they look almost like those in the previous figure. 

Case III. Underdamping 

This is the most interesting case. It occurs if the damping constant c is so small that 
c 2 < 4 mk. Then j3 in (6) is no longer real but pure imaginary, say, 

(9) jS = ico* where = -- V 4 mk — c 2 = J — 7-5- (> 0). 

2m V m 4m 

(We write w* to reserve co for driving and electromotive forces in Secs. 2.8 and 2.9.) The 
roots of the characteristic equation are now complex conjugate, 

A x = —a + i(o*, A 2 = —a — ico* 

with a = c/(2m), as given in (6). Hence the corresponding general solution is 

(10) y(A) = e~ at (A cos (o*t + B sin <o*t) = Ce~ at cos (cu*r — 8) 

where C 2 = A 2 + B 2 and tan 8 = B/A , as in (4*). 

This represents damped oscillations. Their curve lies between the dashed curves 
y = Ce~ at and y = —Ce~ at in Fig. 38, touching them when a>*r - 8 is an integer multiple 
of 7T because these are the points at which cos (<u*r — 8) equals 1 or — 1. 

The frequency is <w*/(2tt ) Hz (hertz, cycles/sec). From (9) we see that the smaller c (> 0) 
is, the larger is a nd th e more rapid the oscillations become. If c approaches 0, then 
approaches co 0 = V k/m , giving the harmonic oscillation (4), whose frequency (o q I(2tt) is 
the natural frequency of the system. 



® Negative ' Fig. 38. Damped oscillation in 


Fig. 37. Critical damping [see (8)] Case 111 [see (10)] 
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EXAMPLE 2 The Three Cases of Damped Motion 

How does the motion in Example 1 change if we change the damping constant c to one of the following three 
values, with y(0) = 0.16 and y'(0) = 0 as before? 

(1) c = 100 kg/sec. (II) c = 60 kg/sec. (Ill) c - 10 kg/sec. 

Solution ♦ It is interesting to see how the behavior of the system changes due to the effect of die damping, 
which takes energy from the system, so that the oscillations decrease in amplitude (Case III) or even disappear 
(Cases II and 1). 

(I) With iti = 10 and k = 90, as in Example 1, the model is the initial value problem 

lOy" 4- 100/ 4- 90y = 0. y(0) = 0.16 [meter], y'( 0) = 0. 

The characteristic equation is 10A 2 4- 100A 4 90 = 10(A 4- 9)(A 4- 1) = 0. It has the roots -9 and - 1. This 
gives the general solution 

y = cie~ 9t 4* c 2 e~ l . We also need y' = — 9c\e~ 91 — c 2 e~ l . 

The initial conditions give Ci 4- c 2 = 0.16, —96^ - c 2 = 0. The solution is c\ = —0.02, c 2 = 0.18. Hence in 
the overdamped case the solution is 

y = -0.02<T 9£ 4- 0.l8e -t . 

It approaches 0 a s / — * ». The approach is rapid; after a few seconds the solution is practically 0, that is, the 
iron ball is at rest. 

(II) The model is as before, with c = 60 instead of 100. The characteristic equation now has the form 
10A 2 4- 60A 4- 90 = 10(A 4* 3) 2 = 0. It has the double root —3. Hence the corresponding general solution is 

y = (t'i 4- c 2 t)e~ zt . We also need / = (c 2 — 3c\ — 3c 2 t)e~ 3t . 

The initial conditions give y(0) = c 2 = 0.16, y'(0) = c 2 - 3cj = 0, c 2 = 0.48. Hence in the critical case the 
solution is 

y = (0.16 4- 0.48/)e“ 3t . 

It is always positive and decreases to 0 in a monotone fashion. 

(III) The model now is 10y" 4- 10/ 4- 90v = 0. Since c = 10 is smaller dian die cridcal c, we shall get 
oscillations. The characteristic equation is 10A 2 4- 10A 4- 90 = l()[(A + |) 2 4- 9 — £] = 0. It has the complex 
roots [see (4) in Sec. 2.2 with a = 1 and b = 9] 

A = -0.5 ± Vo.5 2 - 9 = -0.5 ± 2.96/. 

This gives the general solution 

y = e~ 0 5t (A cos 2.96/ 4- B sin 2.96/). 

Thus y(0) — A = 0.16. We also need the derivative 

y = f _o - 5t (-0.5/i cos 2.96/ - 0.5S sin 2.96/ - 2.964 sin 2.96/ + 2.96 B cos 2.96/). 

Hence y'(0) = -0.54 + 2.96B = 0, B = 0.54/2.96 = 0.027. This gives the solution 

v = <r o - 5t (0.16 cos 2.96/ + 0.027 sin 2.96/) = O.I62e -0 - 5 ‘ cos (2.96/ - 0.17). 

We see that these damped oscillations have a smaller frequency than the harmonic oscillations in Example 1 by 
about 1 % (since 2.96 is smaller than 3.00 by about 1 %). Their amplitude goes to zero. See Fig. 39. H 
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This section concerned free motions of mass-spring systems. Their models are 
homogeneous linear ODEs. Nonhomogeneous linear ODEs will arise as models of forced 
motions, that is, motions under the influence of a “driving force”. We shall study them 
in Sec. 2.8, after we have learned how to solve those ODEs. 



75] MOTION WITHOUT DAMPING 
(HARMONIC OSCILLATIONS) 

1. (Initial value problem) Find the harmonic motion (4) 
that starts from y 0 with initial velocity v Q . Graph or 
sketch the solutions for o) Q = tt\ y 0 = l, and various 
v 0 of your choice on common axes. At what t- values 
do all these curves intersect? Why? 

2. (Spring combinations) Find the frequency of vibration 
of a ball of mass m = 3 kg on a spring of modulus 
(i) k x = 27 nt/m, (ii) k 2 = 75 nt/m, (iii) on these springs 
in parallel (see Fig. 40), (iv) in series, that is, the ball hangs 
on one spring, which in turn hangs on the other spring. 

3. (Pendulum) Find the frequency of oscillation of a 
pendulum of length L (Fig. 41), neglecting air 
resistance and the weight of the rod, and assuming 6 
to be so small that sin 9 practically equals 9 . 

4. (Frequency) What is the frequency of a harmonic 
oscillation if the static equilibrium position of the ball 
is 10 cm lower than the lower end of the spring before 
the ball is attached? 

5. (Initial velocity) Could you make a harmonic oscillation 
move faster by giving the body a greater initial push? 

6. (Archimedian principle) This principle states that the 
buoyancy force equals the weight of the water 
displaced by the body (partly or totally submerged). 
The cylindrical buoy of diameter 60 cm in Fig. 42 is 
floating in water with its axis vertical. When depressed 
downward in the water and released, it vibrates with 
period 2 sec. What is its weight? 


Fig. 40. Parallel 
springs (Problem 2) 



Body of 
mass m 


Fig. 41. Pendulum 
(Problem 3) 




Fig. 42. Buoy (Problem 6) 


7. (F requency) How does the frequency of a harmonic 
motion change if we take (i) a spring of three times the 
modulus, (ii) a heavier ball? 

8. TEAM PROJECT. Harmonic Motions in Different 
Physical Systems. Different physical or other systems 
may have the same or similar models, thus showing the 
unifying power of mathematical methods. Illustrate 
this for the systems in Figs. 43-45. 

(a) Flat spring (Fig. 43). The spring is horizontally 
clamped at one end, and a body of weight 25 nt (about 
5.6 lb) is attached at the other end. Find the motion of 
the system, assuming that its static equilibrium is 2 cm 
below the horizontal line, we let the system start from 
this position with initial velocity 15 cm/sec, and 
damping is negligible. 

(b) Torsional vibrations (Fig. 44). Undamped 
torsional vibrations (rotations back and forth) of a wheel 
attached to an elastic thin rod are modeled by the ODE 
I 0 9" -r K9 = 0, where 6 is the angle measured from the 
state of equilibrium, 7 0 is the polar moment of inertia of 
the wheel about its center, and K is the torsional stiffness 
of the rod. Solve this ODE for K/l 0 = 17.64 sec -2 , initial 
angle 45°, and initial angular velocity 15° sec -1 . 

(c) Water in a tube (Fig. 45). What is the frequency 
of vibration of 5 liters of water (about 1.3 gal) in a 
U-shaped tube of diameter 4 cm, neglecting friction? 



Fig. 43. Flat spring (Project 8a) 



Fig. 44. Torsional 
vibrations (Project 8b) 



Fig. 45. Tube (Project 8c) 


|»-17| DAMPED MOTION 

9. (Frequency) Find an approximation formula for a>* in 
terms of co 0 by applying the binomial theorem in (9) 
and retaining only the first two terms. How good is the 
approximation in Example 2, III? 
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10* (Extrema) Find the location of the maxima and 
minima of y = e“ 2f cos t obtained approximately from 
a graph of y and compare it with the exact values 
obtained by calculation. 

11. (Maxima) Show that the maxima of an underdamped 
motion occur at equidistant /-values and find the 
distance. 

12. (Logarithmic decrement) Show that the ratio of two 
consecutive maximum amplitudes of a damped oscillation 
(10) is constant, and the natural logarithm of this ratio, 
called the logarithmic decrement \ equals A = IttccIoj*. 
Find A for the solutions of y" + 2 v' + 5 y = 0 . 

13. (Shock absorber) What is the smallest value of the 
damping constant of a shock absorber in the suspension 
of a wheel of a car (consisting of a spring and an absorber) 
that will provide (theoretically) an oscillation-free ride 
if the mass of the car is 2000 kg and the spring constant 
equals 4500 kg/sec 2 ? 

14. (Damping constant) Consider an underdamped 
motion of a body of mass m = 2 kg. If the time 
between two consecutive maxima is 2 sec and the 
maximum amplitude decreases to 5 of its initial value 
after 15 cycles, what is the damping constant of the 
system? 

15. (Initial value problem) Find the critical motion (8) 
that starts from y 0 with initial velocity t; 0 . Graph 
solution curves for a = 1 , y 0 = 1 and several u 0 suc ^ 
that (i) the curve does not intersect the /-axis, (ii) it 
intersects it at t = 1 , 2 , * • • , 5 , respectively. 

16. (Initial value problem) Find the overdamped motion 
(7) that starts from y 0 with initial velocity v 0 . 

17. (Overdamping) Show that in the overdamped case, the 
body can pass through y = 0 at most once. 

18. CAS PROJECT. Transition Between Cases I, II, m. 
Study this transition in terms of graphs of typical 
solutions. (Cf. Fig, 46 .) 


(a) Avoiding unnecessary generality is part of good 
modeling . Decide that the initial value problems (A) 
and (B), 

(A) y” + cy 9 + y = 0 , y( 0 ) = 1 , /( 0 ) = 0 

(B) the same with different c and y f ( 0 ) = -2 (instead 
of 0), will give practically as much information as a 
problem with other m 9 k, y( 0), > ,; (0). 

(b) Consider (A). Choose suitable values of c, perhaps 
better ones than in Fig. 46 for the transition from Case 
III to II and I. Guess c for the curves in the figure. 

(c) Time to go to rest. Theoretically, this time is 
infinite (why?). Practically, the system is at rest when 
its motion has become very small, say, less than 0.1% 
of the initial displacement (this choice being up to us), 
that is in our case, 

(11) \y(0\ < 0.001 for all t greater than some t v 

In engineering constructions, damping can often be varied 
without too much trouble. Experimenting with your 
graphs, find empirically a relation between / x and c. 

(d) Solve (A) analytically . Give a reason why the 
solution c of y(r 2 ) = -0.001, with t 2 the solution of 
y' (/) = 0, will give you the best possible c satisfying (11). 

(e) Consider (B) empirically as in (a) and (b). What 
is the main difference between (B) and (A)? 



2.5 Euler-Cauchy Equations 

Euler-Cauchy equations 4 are ODEs of the form 
( 1 ) x 2 y ,f + axy 9 4 - by = 0 


Leonhard EULER (1 707-1783) was an enormously creative Swiss mathematician. He made fundamental 
contributions to almost all branches of mathematics and its application to physics. His important books on algebra 
and calculus contain numerous basic results of his own research. The great French mathematician AUGUSTIN 
LOUIS CAUCHY (1789-1857) is the father of modem analysis. He is the creator of complex analysis and had 
great influence on ODEs, PDEs. infinite series, elasticity theory, and optics. 
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EXAMPLE 1 


with given constants a and b and unknown y(x). We substitute 

(2) y = a *” 1 

and its derivatives y = mx 7n ~ A and y" = m(m — l)A m ~ 2 into (1). This gives 
x 2 m(m - l)A m “ 2 + axmx™- 1 + bx m = 0. 

We now see that (2) was a rather natural choice because we have obtained a common 
factor x m . Dropping it, we have the auxiliary equation m(m — 1) + am + b = 0 or 

(3) m 2 4- (a — 1 )m + b = 0. (Note: a — 1, not a.) 


Hence y = x m is a solution of (1) if and only if m is a root of (3). The roots of (3) are 

(4) m ! = |(] - a) + VlC 1 ~ of ~ b, m 2 = |(] - a) - \/^(\ - a) 2 - b. 
Case I. If the roots and m 2 are real and different, then solutions are 

.Vi(a) = A mi and y 2 (x) — x™ 2 

They are linearly independent since their quotient is not constant. Hence they constitute 
a basis of solutions of (1) for all a for which they are real. The corresponding general 
solution for all these a- is 

(5) y = Cx a Wi + c 2 x mz (c x , c 2 arbitrary). 


General Solution in the Case of Different Real Roots 


The Euler-Cauchy equation 


has the auxiliary equation 


A-y + l.5.vv' - 0.5v = 0 


m 2 + 0.5/m - 0.5 = 0. 


The roots are 0.5 and — I. Hence a basis of solutions for all positive a* is y 2 
general solution 

v = C]Vx + — 

.V 


(Note: 0.5, not 1.5!) 
a 0,5 and y 2 = l/.v and gives the 

(a- > 0). ■ 


Case II. Equation (4) shows that the auxiliary equation (3) has a double root 
/72 1 =|(1 — a) if and only if (1 - a) 2 - 4b = 0. The Euler-Cauchy equation (1) then 
has the form 

(6) a 2 y" + axy f + J(1 — a) 2 y = 0. 


A solution is y x = * (l “ a)/2 . To obtain a second linearly independent solution, we apply 
the method of reduction of order from Sec. 2.1 as follows. Starting from y 2 = uy ly we 
obtain for u the expression (9), Sec. 2.1, namely. 



where 
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Here it is crucial that p is taken from the ODE written in standard form, in our case, 


(6*) 



(1 ~ a ) 2 

4a* 2 


y = o. 


This shows that p = atx (not ax). Hence its integral is a In a* = In (A a ), the exponential 
function in U is 1/A* a , and division by y t 2 = x 1 " a gives U = 1/a, and u = In x by integration. 

Thus, in this “critical case,” a basis of solutions for positive x is y 1 = x m and 
y 2 = x m In a, where m = |(1 — a). Linear independence follows from the fact that the 
quotient of these solutions is not constant. Hence, for all x for which and y 2 are defined 
and real, a general solution is 


( 7 ) 


y = (ci + c 2 In a) A m , 


m = g(l - a ). 


EXAMPLE 2 General Solution in the Case of a Double Root 

The Euler-Cauchy equation x 2 y" - 5xy' -f 9y = 0 has the auxiliary equation m 2 - 6 m 4- 9 = 0. It has the 
double root m = 3* so that a general solution for all positive x is 

y = (ci 4- c 2 In a) a 3 . ■ 

Case III. The case of complex roots is of minor practical importance, and it suffices to 
present an example that explains the derivation of real solutions from complex ones. 

EXAMPLE 3 Real General Solution in the Case of Complex Roots 

The Euler-Cauchy equation 

x 2 y" 4- O.&v/ + 16.04)* = 0 

has the auxiliary equation m 2 — OAm 4- 16.04 = 0. The roots are complex conjugate, m 1 = 0.2 4- 4/ and 
m 2 = 0.2 - 4 1 \ where i = V^T. (We know from algebra that if a polynomial with real coefficients has complex 
roots, these are always conjugate.) Now use the trick of writing x — e ln x and obtain 

V »U _ v 0.2+4i _ v 0.2^ln .X‘^4? _ v 0.2^(4 In .t) i 
V W 2 — v 0.2-4i _ ^0.2^ln A^-4i _ ^.0.2^-(4 In x)i 

Next apply Euler's formula (11) in Sec. 2.2 with / = 4 In x to these two formulas. This gives 

v m, = A .0.2 fcos (4 , n x y + j sjn (4 ln A) -| t 

.v'” 2 = ,v 0,2 [cos (4 ln .v) — r sin (4 In .v)]. 

Add these two formulas, so that the sine drops out, and divide the result by 2. Then subtract the second formula 
from the first, so that the cosine drops out, and divide the result by 2/. This yields 

.v 0 - 2 cos (4 ln x) and .v 0 2 sin (4 ln .v) 

respectively. By the superposition principle in Sec. 2.2 these are solutions of die Euler-Cauchy equation (1). 
Since their quotient cot (4 in .v) is not constant, they are linearly independent. Hence they form a basis of solutions, 
and the corresponding real general solution for all positive .v is 

(8) y = x°' 2 [A cos (4 In x ) 4- B sin (4 In a)]. 

Figure 47 shows typical solution curves in the three cases discussed, in particular the basis fiinctions in 
Examples 1 and 3. ■ 
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Case I: Real roots CaseU: Double root Case III: Complex roots 

Fig. 47. Euler-Cauchy equations 


EXAMPLE 4 Boundary Value Problem. Electric Potential Field Between Two Concentric Spheres 

Find the electrostatic potential v = v(r) between two concentric spheres of radii r x = 5 cm and r 2 = 10 cm 
kept at potentials y* = 110 V and y 2 = 0 t respectively. 

Physical Information. v{r) is a solution of the Euler-Cauchy equation ru" + 2u f = 0, where v* = du/dr. 

Solution . The auxiliary equation is m 2 + m = 0. It has the roots 0 and —1. This gives the general solution 
y(r) = + c 2 /r. From the “boundary conditions” (the potentials on the spheres) we obtain 

i>(5) = c t + y = 110. o(10) = c i + 75 - = 0- 

By subtraction. c 2 /10 = 110, c 2 = 1100. From the second equation, ci = — c 2 /10 = -110. Answer: 
y(r) = —110+ 1 100/r V. Figure 48 shows that the potential is not a straight line, as it would be for a potential 
between two parallel plates. For example, on the sphere of radius 7.5 cm it is not 1 10/2 = 55 V, but considerably 
less. (What is it?) ■ 



Fig. 48. Potential v(r) in Example 4 


HESEBEEfit 


1-10| GENERAL SOLUTION 

Find a real general solution, showing the details of your 
work. 

1. x 2 y" — 6y = 0 2. 4* 2 y" + 4.ry ' - y = 0 

3. x 2 y" - Ixy' + 16y = 0 

4 . x 2 y" + 3 xy' + y = 0 5. x 2 y" - xy' + 2y = 0 

6 . 2x 2 y" + 4 xy' + Sy = 0 

7. (10a- 2 D 2 - 20 xD + 22AI)y = 0 

8. (4a- 2 D 2 + l)y = 0 9. (100* 2 Z) 2 + 9/)}- = 0 

10. (10 x 2 D 2 + 6xD + 0.5l)y * 0 


11-IS| INITIAL VALUE PROBLEM 

Solve and graph the solution, showing the details of your 
work. 

11. x 2 y" - 4 xy' + 6y = 0. y(l) = l, >-'(1) = 0 

12. x 2 y" + 3xy' + y = 0. y(l) = 4, /(l) = -2 

13. ( x z D 2 + 2 xD + 100.25 l)y = 0, >>(1) = 2. 

/(I) = -11 

14. (x : 2 D 2 -2 xD + 2.25 l)y = 0. y(l) = 2.2, 

/(l) = 2.5 

15. (xD 2 + 4 D)y = 0. *1) = 12, /(l) = -6 
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16. TEAM PROJECT. Double Root 

(A) Derive a second linearly independent solution of 
(1) by reduction of order; but instead of using (9), Sec. 
2.1. perform all steps directly for the present ODE (1 ). 

(B) Obtain x m In x by considering the solutions x m and 
x m+s 0 f a su jt a bie Euler-Cauchy equation and letting 
. 9 — » 0. 


(C) Verify by substitution that x m In x, m = (1 — a) 12, 
is a solution in the critical case. 

(D) Transform the Euler-Cauchy equation (1) into an 
ODE with constant coefficients by setting x = e l (x > 0). 

(E) Obtain a second linearly independent solution of 
the Euler-Cauchy equation in the “critical case” from 
that of a constant-coefficient ODE. 


2.6 Existence and Uniqueness of Solutions. 
Wronskian 

In this section we shall discuss the general theory of homogeneous linear ODEs 

(1) y* + p(x)y' + qix)y = 0 

with continuous, but otherwise arbitrary variable coefficients p and q. This will concern 
the existence and form of a general solution of (1) as well as the uniqueness of the solution 
of initial value problems consisting of such an ODE and two initial conditions 

(2) y(x 0 ) = K 0i y'(x 0 ) = Ki 
with given * 0 , and Ki- 

The two main results will be Theorem 1, stating that such an initial value problem 
always has a solution which is unique, and Theorem 4, stating that a general solution 

(3) y = + c 2 y 2 (c a , c 2 arbitrary) 

includes all solutions. Hence linear ODEs with continuous coefficients have no “singular 
solutions ” (solutions not obtainable from a general solution). 

Clearly, no such theory was needed for constant-coefficient or Euler-Cauchy equations 
because everything resulted explicitly from our calculations. 

Central to our present discussion is the following theorem. 


THEOREM 1 


Existence and Uniqueness Theorem for Initial Value Problems 

If p(x) and q{x) are continuous functions on some open interval I (see Sec. 1.1) and 
x 0 is in /, then the initial value problem consisting of (1) and (2) has a unique 
solution y(x) on the interval I. 


The proof of existence uses the same prerequisites as the existence proof in Sec. 1.7 
and will not be presented here; it can be found in Ref. [A1 1] listed in App. 1. Uniqueness 
proofs are usually simpler than existence proofs. But for Theorem 1, even the uniqueness 
proof is long, and we give it as an additional proof in App. 4. 
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THEOREM 2 


PROOF 


Linear Independence of Solutions 

Remember from Sec. 2.1 that a general solution on an open interval / is made up from a 
basis Vx, y 2 on /, that is, from a pair of linearly independent solutions on /. Here we call 
y u y 2 linearly independent on / if the equation 

(4) &i3TU') -f k 2 y 2 (x) = 0 on / implies = 0, = 0. 

We call _v x , y 2 linearly dependent on 1 if this equation also holds for constants k l9 k 2 
not both 0. In this case, and only in this case, y x and y 2 are proportional on /, that is (see 
Sec. 2.1), 

(5) (a) y x = ky 2 or (b) y 2 = ly x for all x on /. 

For our discussion the following criterion of linear independence and dependence of 
solutions will be helpful. 


Linear Dependence and Independence of Solutions 

Let the ODE ( 1 ) have continuous coefficients pQc) and q(x) on an open interval 1. 
Then two solutions y x and y 2 of{\) on I are linearly dependent on I if and only if 
their “Wronskian” 

(6) W(.vi, 3’ss) = yiyk - y 2 y'i 

is 0 at some a * 0 in /. Furthermore, ifW= Oat an x = x 0 in /, then W = 0 on I: hence 
if there is an at in I at which W is not 0, then y lf y 2 are linearly independent on 1. 


(a) Let y x and y 2 be linearly dependent on I. Then (5a) or (5b) holds on /. If (5a) holds, then 
Mji, y 2 ) = 3’i.V2 - y 2 y 1 = ky 2 y 2 - y 2 kyz = 0. 

Similarly if (5b) holds. 

(b) Conversely, we let W(vi, y 2 ) = 0 for some x = x 0 and show that this implies linear 
dependence of Vi and y 2 on I. We consider the linear system of equations in the unknowns 
ki, k 2 

kiyi(x 0 ) + k 2 y z (x 0 ) = 0 
(7) 

kiy[(x 0 ) + = 0. 

To eliminate k 2 , multiply the first equation by y 2 and the second by —y 2 and add the 
resulting equations. This gives 


kiyi(x 0 )y 2 (,*o) ~ = WyiOto), .V 2 (-«o)) = 0. 


Similarly, to eliminate k 1> multiply the first equation by —y[ and the second by y x and 
add the resulting equations. This gives 


kzWiy^Xo), y 2 (x 0 )) = 0. 
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EXAMPLE 1 


EXAMPLE 2 


If W were not 0 at x 0 , we could divide by W and conclude that k x = k 2 = 0. Since W is 
0, division is not possible, and the system has a solution for which k x and k 2 are not both 
0. Using these numbers k lf k 2 , we introduce the function 

y(x) = ki>\(x) + k 2 y 2 {x). 

Since (1) is homogeneous linear, Fundamental Theorem 1 in Sec. 2.1 (the superposition 
principle) implies that this function is a solution of ( 1 ) on /. From (7) we see that it satisfies 
the initial conditions y(A* 0 ) = 0, y f ( x 0 ) = 0. Now another solution of (1) satisfying the 
same initial conditions is y* = 0. Since the coefficients p and q of (1) are continuous. 
Theorem 1 applies and gives uniqueness, that is, y = y*, written out 

+ ^ 2>2 = 0 on /. 

Now since k t and k 2 are not both zero, this means linear dependence of y l9 y 2 on /. 

(c) We prove the last statement of the theorem. If WCv 0 ) = 0 at an x 0 in /, we have 
linear dependence of y ls y 2 on / by part (b), hence W = 0 by part (a) of this proof. Hence 
in the case of linear dependence it cannot happen that Wfa) ^ 0 at an x A in /. If it does 
happen, it thus implies linear independence as claimed. ■ 

Remark. Determinants. Students familiar with second-order determinants may have 
noticed that 


W()’i> )’ 2 ) 


yi v 2 


y'i 


y f 2 


= yiyL - yzyi 


This determinant is called the Wronski determinant 5 or, briefly, the Wronskian, of two 
solutions Vj and y 2 of (l), as has already been mentioned in (6). Note that its four entries 
occupy the same positions as in the linear system (7). 


Illustration of Theorem 2 

The functions y 1 = cos cox and v 2 = sin tox are solutions of y" + co 2 v = 0. Their Wronskian is 


W(cos cox, sin w.v) 


cos (OX 
— to sin (ox 


sin cox 
(O cos tt.V 


= v lt >’ 2 ~ V 2 .V 1 = (o cos 2 cox + to sin 2 (ox = to. 


Theorem 2 shows that these solutions arc linearly independent if and only if to ^ 0. Of course, we can see 
this directly from the quotient y 2 /yi = tan au\ For w = Owe have y 2 — 0. which implies linear dependence 
(why?). ■ 


Illustration of Theorem 2 for a Double Root 

A general solution of y" — 2 y + v = 0 on any interval is y = (c x + c 2 x)e x . (Verify!). The corresponding 
Wronskian is not 0. which shows linear independence of e x and xe x on any interval. Namely, 


W(x, xe x ) = 


C x + \)e* 


= (.v + \)e 


2* T - xe 2x = # 0 . 


’Introduced by WRONSKI (JOSEF MARIA HONE. 1776-1853). Polish mathematician. 
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THEOREM 3 


PROOF 


THEOREM 4 


PROOF 


A General Solution of (1) Includes All Solutions 

This will be our second main result, as announced at die beginning. Let us start with existence. 


Existence of a General Solution 

Ifp(x) and q(x) are continuous on an open internal /, then (1) has a general solution 
on L 


By Theorem 1, the ODE (1) has a solution y x (jc) on I satisfying die initial conditions 

.ViO'o) = 1, y'i(x 0 ) = 0 

and a solution y 2 (x) on I satisfying the initial conditions 


yz&o) = 0, j2(*o) = 1* 

The Wronskian of these two solutions has at x = a 0 the value 

W0'i(0), y 2 (0)) = yi(* 0 ):v2(*o) - 3 ? 2(*o).v((* 0 ) = i. 

Hence, by Theorem 2, these solutions are linearly independent on /. They form a basis of 
solutions of (1) on /, and y = + c 2 y 2 with arbitrary c lt c 2 is a general solution of (1) 

on /, whose existence we wanted to prove. ■ 

We finally show that a general solution is as general as it can possibly be. 


A General Solution Includes All Solutions 

If the ODE (1) has continuous coefficients p(x) and q(x) on some open internal I, 
then every solution y = 7(a) 0/(1) on I is of the form 

(8) 7(a) = C iyi (x) + C 2 y 2 ( x) 

where y 2 is any basis of solutions of (l) on I and C v C 2 are suitable constants . 

Hence ( 1 ) does not have singular solutions ( that is, solutions not obtainable from 
a general solution l 


Let y = 7(a) be any solution of (1) on /. Now, by Theorem 3 the ODE (1) has a general 
solution 

(9) y(x) = CiJxW + c 2 y 2 (x) 

on I. We have to find suitable values of c ls c 2 such that j(a) = 7(a) on /. We choose any 
x o in I and show first that we can find values of c lf c 2 such that we reach agreement at 
a 0 , that is, y(A 0 ) = 7(a 0 ) and >’'(a 0 ) = 7'(a 0 ). Written out in terms of (9), this becomes 


( 10 ) 


(a) Ciy^Ao) + c 2 y 2 (x 0 ) = 7(a 0 ) 

(b) Ciyi(A 0 ) + c 2 y 2 {X(f) = 7/a 0 ). 
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We determine the unknowns c x and c 2 . To eliminate c 2 , we multiply (10a) by y 2 (A* 0 ) and 
(10b) by — v 2 (a' 0 ) and add the resulting equations. This gives an equation for c v Then we 
multiply (10a) by — yj[(A' 0 ) and (10b) by .Vi(a 0 ) and add the resulting equations. This gives 
an equation for c 2 . These new equations are as follows, where we take the values of y 1? 
yi y 2, yL T, Y l at a 0 , 


ci(yiyL - yzy'i) = ciWO’i. yz) = Yy'z - y^Y' 

c 2 (.ViV2 - y^y'i) = c 2 W(y ly y 2 ) = - Yy[. 


Since y 1? y 2 is a basis, the Wronskian W in these equations is not 0, and we can solve for 
c x and c 2 . We call the (unique) solution c x = C x , c 2 = C 2 . By substituting it into (9) we 
obtain from (9) the particular solution 

y*(x) = Ctfrf x) + C 2 y 2 (x). 

Now since C lf C 2 is a solution of (10), we see from (10) that 
v*(a 0 ) = y(A 0 ), y*'( X q) = Y f (x 0 ). 

From the uniqueness stated in Theorem 1 this implies that y* and Y must be equal 
everywhere on /, and the proof is complete. ■ 

Looking back at he content of this section, we see that homogeneous linear ODEs with 
continuous variable coefficients have a conceptually and structurally rather transparent 
existence and uniqueness theory of solutions. Important in itself, this theory will also 
provide the foundation of an investigation of nonhomogeneous linear ODEs, whose theory 
and engineering applications we shall study in the remaining four sections of this chapter. 



1 1-17] BASES OF SOLUTIONS. 

CORRESPONDING ODEs. WRONSKIANS 

Find an ODE (1) for which the given functions are 
solutions. Show linear independence (a) by considering 
quotients, (b) by Theorem 2. 

1. e°’ 5x , e~°’ 5x 2. cos tta, sin ttx 

3. e kx , xe kx 4. * 3 , a*’ 2 

5. a 0 * 25 , a 0 25 In a 6. <? 3 - 4a \ 

7. cos (2 In a), sin (2 In a) 

8 . e~ 2x , xe~ 2x 9 . a 15 , a' 0 * 5 

10. a“ 3 , x“ 3 In a 11. cosh 2.5a, sinh 2.5a 

12. e~ 2x cos o)x y e~ 2x sin cox 

13. e~ x cos 0.8a, e~ x sin 0.8a 

14. a” 1 cos (In a), a -1 sin (In a) 

15. e ~ 2 ‘ 5x cos 0.3a. e ~ 2 5x sin 0.3 a 

16. e~ kx cos 7ta, e~ kx sin ?ta 

17. xe~ 2 Birx 


18. TEAM PROJECT. Consequences of the Present 
Theory. This concerns some noteworthy general 
properties of solutions. Assume that the coefficients p 
and q of the ODE (1) are continuous on some open 
interval /, to which the subsequent statements refer. 

(A) Solve y " — y = 0 (a) by exponential functions, 
(b) by hyperbolic functions. How are the constants in 
the corresponding general solutions related? 

(B) Prove that the solutions of a basis cannot be 0 at 
the same point. 

(C) Prove that the solutions of a basis cannot have a 
maximum or minimum at the same point. 

(D) Express (y 2 /> , i) / by a formula involving the 
Wronskian W . Why is it likely that such a formula 
should exist? Use it to find W in Prob. 10. 

(E) Sketch y x ( a) = a 3 if a ^ 0 and 0 if a < 0, 
y 2 (x) = 0 if a i= 0 and a 3 if a < 0. Show linear 
independence on -1 < a < 1. What is their 
Wronskian? What Euler-Cauchy equation do y lt y 2 
satisfy? Is there a contradiction to Theorem 2? 
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(F) Prove Abel’s formula 6 where c = M v^Ao). y 2 (.v 0 )). Apply it to Prob. 12. Hint: 

Write (1) for y! and for y 2 . Eliminate q algebraically 
from these two ODEs, obtaining a first-order linear 
ODE. Solve it. 


W(vi(a-), y 2 ( a)) = c exp 


- | pit) clt 

- 


2.7 Nonhomogeneous ODEs 

Method of Undetermined Coefficients 

In this section we proceed from homogeneous to nonhomogeneous linear ODEs 

(1) y" + p(x)v' + q(x)y = r{x) 

where r(x) ^ 0. We shall see that a “general solution” of (1) is the sum of a general 
solution of the corresponding homogeneous ODE 

(2) y" + p(x)y' + q(x)y = 0 

and a “particular solution” of (1). These two new terms “general solution of (1)” and 
“particular solution of (1)” are defined as follows. 


DEFINITION 


General Solution, Particular Solution 

A general solution of the nonhomogeneous ODE (1) on an open interval 7 is a 
solution of the form 

(3) y(x) = y h (x) + y p (x); 

here, y h = Ci}\ + c 2 .y 2 is a general solution of the homogeneous ODE (2) on 7 and 
y p is any solution of (1) on 7 containing no arbitrary constants. 

A particular solution of (1) on 7 is a solution obtained from (3) by assigning 
specific values to the arbitrary constants and c 2 in y h . 


Our task is now twofold, first to justify these definitions and then to develop a method 
for finding a solution y p of (1). 

Accordingly, we first show that a general solution as just defined satisfies (1) and that 
the solutions of (1) and (2) are related in a very simple way. 


THEOREM 1 


Relations of Solutions of (1) to Those of (2) 

(a) The sum of a solution y of ( 1 ) on some open interval J ancl a solution y of 
(2) on I is a solution of ( 1) on 7. In particular, (3) is a solution of { 1) on 7. 

(b) The difference of two solutions of ( 1) on 7 is a solution of (2) on 7. 


6 NIELS HENRIK ABEL (1802-1829). Norwegian mathematician. 
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PROOF (a) Let L[y] denote the left side of (I). Then for any solutions y of (1) and y of (2) on I, 

L[y + y] = L[y] + L[y] = r + 0 = /*. 


(b) For any solutions y and y* of (1) on / we have L[y — y*] = L[yJ — L[y*] = /*-/* = 0. 

■ 

Now for homogeneous ODEs (2) we know that general solutions include all solutions. 
We show that the same is true for nonhomogeneous ODEs (1). 


THEOREM 2 


A General Solution of a Nonhomogeneous ODE Includes All Solutions 

If the coefficients p(x), q(x\ and the function r(x) in (1) are continuous on some 
open interval l, then every solution of (1) on l is obtained by assigning suitable 
values to the arbitrary constants c x and c 2 in a general solution (3) of (l) on 1. 


PROOF Let y* be any solution of ( 1 ) on / and x 0 any a* in /. Let (3) be any general solution of (1) 
on I. This solution exists. Indeed, y h = Cj Vj + c 2 > r 2 exists by Theorem 3 in Sec. 2.6 
because of the continuity assumption, and y p exists according to a construction to be shown 
in Sec. 2.10. Now, by Theorem 1(b) just proved, the difference Y = y* — y p is a solution 
of (2) on 7. At A' 0 we have 


Y(x 0 ) = .y*(.* 0 ) - .v p (a- 0 ), Y'(x o) = y*'U 0 ) - ?p(x 0 ). 

Theorem 1 in Sec. 2.6 implies that for these conditions, as for any other initial conditions 
in /, there exists a unique particular solution of (2) obtained by assigning suitable values 
to c„ c 2 in y, r From this and y* = Y + y v the statement follows. ■ 

Method of Undetermined Coefficients 

Our discussion suggests the following. To solve the nonhomogeneous ODE ( 1 ) or an initial 
value problem for (1), we have to solve the homogeneous ODE (2) and find any solution 
y p 0/(1), so that we obtain a general solution (3) of (1). 

How can we find a solution y p of (1)? One method is the so-called method of 
undetermined coefficients. It is much simpler than another, more general method (to be 
discussed in Sec. 2.10). Since it applies to models of vibrational systems and electric 
circuits to be shown in the next two sections, it is frequently used in engineering. 

More precisely, the method of undetermined coefficients is suitable for linear ODEs 
with constant coefficients a and b 

(4) y" + ay 1 + by = r(x) 

when r(x) is an exponential function, a power of .v, a cosine or sine, or sums or products 
of such functions. These functions have derivatives similar to r(x) itself. This gives the 
idea. We choose a form for y p similar to /*(a*), but with unknown coefficients to be 
determined by substituting that y p and its derivatives into the ODE. Table 2.1 on p. 80 
shows the choice of y p for practically important forms of >*(a;). Corresponding rules are 
as follows. 
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EXAMPLE 1 


Choice Rules for the Method of Undetermined Coefficients 

(a) Basic Rule. If r(x) in (4) is one of the functions in the first column in 
Table 2.1, choose y p in the same line and determine its undetermined 
coefficients by substituting y p and its derivatives into (4). 

(b) Modification Rule. If a term in your choice for y p happens to be a 
solution of the homogeneous ODE corresponding to (4), multiply your 
choice of y p by x (or by x 2 if this solution corresponds to a double root of 
the characteristic equation of the homogeneous ODE). 

(c) Sum Rule. If r(x) is a sum of functions in the first column of Table 2.1, 
choose for y p the sum of the functions in the corresponding lines of the 
second column. 


The Basic Rule applies when r^v) is a single term. The Modification Rule helps in the 
indicated case, and to recognize such a case, we have to solve the homogeneous ODE 
first. The Sum Rule follows by noting that the sum of two solutions of (1) with r = i\ 
and r = ;* 2 (and the same left side!) is a solution of (1) with /• = r x 4 r 2 . (Verify!) 

The method is self-correcting. A false choice for y p or one with too few terms will lead 
to a contradiction. A choice with too many terms will give a correct result, with superfluous 
coefficients coming out zero. 

Let us illustrate Rules (a)-(c) by the typical Examples 1-3. 

Table 2.1 Method of Undetermined Coefficients 


Term in r(x) 

Choice for j p (a) 

ke yx 

Ce yx 

kx n (n = 0, 1, • • •) 

K n x n + K n _! a-”" 1 + • • • + Kjx + K 0 

k cos cox 
k sin cox 

1 

J 

K cos cox + M sin cox 

ke nX cos cox 
ke aX sin cox 

1 

J 

e aX (K cos cox 4- M sin tax) 


Application of the Basic Rule (a) 

Solve the initial value problem 

(5) v" + v = O.OOLv 2 , y( 0) = 0, y'(0) = 1.5. 

Solution. Step 1. General solution of the homogeneous ODE. The ODE y" + y = 0 has the general solution 

y h = A cos .v + B sin .v. 

Step 2. Solution y p of the nonhomogeneous ODE. We first try y p - Kx 2 . Then y p = 2 K. By substitution, 
2 K + Kx 2 = 0.00 l.v 2 . For this to hold for all a*, the coefficient of each power of a* (a 2 and a 0 ) must be the same 
on both sides; thus K = 0.001 and 2K = 0, a contradiction. 

The second line in Table 2. 1 suggests the choice 

v p = K 2 .x 2 + K x x + K 0 . Then y' p + y p = 2K Z + K z x 2 + K x x + K 0 = 0.00 l.v 2 . 

Equating the coefficients of .v 2 , x. x° on both sides, we have K z = 0.00 1. K x = 0. 2 K z + K 0 = 0. Hence 
K o = -2K 2 = -0.002. This gives v p = O.OOl.v 2 - 0.002. and 

.v = y h + v p = A cos.v + B sin .v + O.OOl.v 2 - 0.002. 
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EXAMPLE 2 


Step 3. Solution of the initial value problem . Setting a = 0 and using the first initial condition gives 
v(0) = A — 0.002 = 0. hence A = 0.002. By differentiation and from the second initial condition, 

y = yh + Yp = -A sin a- + B cos.v + 0.002v and y'( 0) = B = 1.5. 


This gives the answer (Fig. 49) 

y = 0.002 cos a + 1.5 sin a + 0.001 a 2 - 0.002. 

Figure 49 shows y as well as the quadratic parabola y p about which y is oscillating, practically like a sine curve 
since the cosine lerm is smaller by a factor of about I/I 000. H 

y 
2 
1 
0 
-1 

Fig. 49. Solution in Example 1 



Application of the Modification Rule (b) 

Solve the initial value problem 

(6) y" + 3y' + 2.25y = - 10 e~ 15x \ y(0) = 1, y'(0) = 0. 

Solution . Step 1. General solution of the homogeneous ODE . The characteristic equation of the 
homogeneous ODE is A 2 + 3A + 2.25 = (A + 1.5) 2 = 0. Hence the homogeneous ODE has the general 
solution 

>'h = (ci + c 2 x)e~ l5x . 


Step 2. Solution y p of the nonhomogeneous ODE . The function <?” L5a on the right would normally require 
the choice Ce~ 15x . But we see from y h that this function is a solution of the homogeneous ODE. which 
corresponds to a double root of the characteristic equation. Hence, according to the Modification Rule we have 
to multiply our choice function by a 2 . That is. we choose 

y p = C.vV 15r Then y p = C(2v - \.5x 2 )e~ l5x , y p = C( 2 - 3a - 3.v + 2.25a 2 )<T 15x . 

We substitute these expressions into the given ODE and omit the factor e“ 1,5x *. This yields 

C(2 - 6a + 2.25a 2 ) + 3C(2v - 1.5a 2 ) + 2.25Ca 2 = -10. 

Comparing the coefficients of a 2 , a. a 0 gives 0 = 0, 0 = 0. 2C = — 10, hence C = -5. This gives the solution 
\p = -5a 2 «?” 1,5x . Hence the given ODE has the general solution 

.V = y h + v p = (ci + c 2 x)e- 15x - Sx 2 e-' * x . 

Step 3. Solution of the initial value problem . Setting a = 0 in y and using the first initial condition, we obtain 
y(0) = Ci = I. Differentiation of y gives 

y' = (c 2 - 1 .5c*! - \.5c 2 x)e~ 13x - lOw -1 *®* + 1.5x 2 e~ 13x . 

From this and the second initial condition we have y'(0) = c 2 — l.5cj = 0. Hence c 2 = 1.5c] = 1.5. This 
gives the answer (Fig. 50) 

y = (l + 1.5 a) e~ L5x - 5x 2 e~ i 5x = (1 + |.5a - 5.v 2 )e“ 1,5a: . 


The curve begins with a horizontal tangent, crosses the .v-axis at x = 0.6217 (where 1 + I. 5a - 5a 2 = 0) and 
approaches the axis from below as a increases. ■ 



82 


CHAP. 2 Second-Order Linear ODEs 


EXAMPLE 3 



Application of the Sum Rule (c) 

Solve the initial value problem 

(7) y" + 2 y 4- 5y = e 05x + 40 cos IO.y - 190 sin IO.t, y(0) = 0.16, y'( 0) = 40.08. 

Solution. Step 1. General solution of the homogeneous ODE . The characteristic equation 
A 2 4- 2A 4- 5 = (A + 1 + 2/)(A + 1 - 2/) = 0 

shows that a real general solution of the homogeneous ODE is 

y h = e~ x ( A cos 2v + B sin 2r). 

Step 2. Solution of the nonhomogeneous ODE. We write y p = y pl + y P 2 » where y pl corresponds to the 
exponential term and y P 2 to the sum of the other two terms. We set 

Vpl = Ce°' 5x . Then = 0.5Ce os * and Vpj = 0.25G? as *. 

Substitution into the given ODE and omission of the exponential factor gives (0.25 -1- 2 • 0.5 4~ 5)C — 1, hence 
C = 1/6.25 = 0.16, and y pl = O.I6e os *. 

We now set y ]>2 = K cos 10.v + M sin 1 OLv, as in Table 2.1, and obtain 

)’p 2 = — 10/C sin 10.v + 10 M cos 1 0.v, y p2 = —100 K cos 10.v — 100M sin 10.V. 

Substitution into the given ODE gives for the cosine terms and for the sine terms 

-100 K + 2- 10M + 5K = 40. -100 M - 2 • I0AT + 5M = -190 

or, by simplification, 

-95 K + 20 M == 40. -20 K - 95M = - 190. 

The solution is K = 0, M = 2. Hence y p2 = 2 sin 10.v. Together, 

y = 37 1 + 3 ? pi + yp2 ~ e ~ X (A cos 2v + B sin 2v) + 0.16e o ‘ 5a; + 2 sin lO.r. 

Step 3. Solution of the initial value problem . From y and the first initial condition. y(0) = A 4- 0.16 = 0.16, 
hence A = 0. Differentiation gives 

y = *>“*(—/} cos 2v - B sin 2v — 2A sin 2v + 2 B cos 2v) 4- 0.08# 0 * 5 * 4* 20 cos 10.v. 

From this and the second initial condition we have y\ 0) = — A 4- 2B 4- 0.08 4- 20 = 40.08, hence B = 10. 
This gives the solution (Fig. 5 1 ) 

>• = \0e~ x sin 2v 4- 0A6e O3x 4* 2 sin 10.v. 

Tlie first term goes to 0 relatively fast. When x = 4, it is practically 0, as the dashed curves ± lOe”* 4- 0.16e o,5,T 
show. From then on, the last term, 2 sin !0.v, gives an oscillation about 0.16* 0,5 *, the monotone increasing 
dashed curve. ■ 
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y 

10 
8 
6 
4 
2 

“0 
-2 

-4 

Fig. 51. Solution in Example 3 

Stability. The following is important. If (and only if) all the roots of the characteristic 
equation of the homogeneous ODE y" 4- ay f 4- by = 0 in (4) are negative, or have a negative 
real part, then a general solution y h of this ODE goes to 0 as a* — > *», so that the “transient 
solution” v = y h 4- y p of (4) approaches the “steady-state solution” y p . In this case the 
nonhomogeneous ODE and the physical or other system modeled by the ODE are called 
stable; otherwise they are called unstable. For instance, the ODE in Example 1 is unstable. 
Basic applications follow in the next two sections. 





1 1-14 1 GENERAL SOLUTIONS OF 
NONHOMOGENEOUS ODEs 

Find a (real) general solution. Which rule are you using? 
(Show each step of your calculation.) 

1. y" + 3v' + 2y = 30e 2x 

2. y" 4- Ay' 4- 3.75y = 109 cos 5 a 

3. y" - 16y = 19.2<? 4 * + 60e x 

4. y " -I- 9v = cos a 4- ^ cos 3a* 

5. y" + / - 6y = 6 a* 3 - 3a 2 4- 12a 

6. y" 4- Ay' 4- 4y = e~ 2x sin 2.v 

7. y" 4- 6y' 4- 73y = 80*r T cos 4a 

8. y" 4- l 0 y ' 4- 25y =100 sinh 5 a 

9. y" - 0.1 6y = 32 cosh 0.4 a 

10. y" + Ay' 4- 6.25y = 3.125(a 4- l) 2 

11. y" 4- 1.44y = 24 cos 1.2a 

12. y" 4- 9y = 18a 4- 36 sin 3a 

13. y" 4- 4v' 4- 5v = 25a 2 4- 13 sin 2 a 

14. y" 4- 2 y f 4* y = 2 a sin A 

15-20 1 INITIAL VALUE PROBLEMS FOR 
NONHOMOGENEOUS ODEs 

Solve the initial value problem. State which rules you are 
using. Show each step of your calculation in detail. 

15. y" 4- 4y = 16 cos 2a, y(0) = 0, y'(0) = 0 


16. 

tt 

y ~ 

3y' + 

2.25y = 27(.v 2 - .v), 


y(0) 

= 20, 

y‘ (0) = 

= 30 

17. 

y" + 

0.2y' 

+ 0.26)’ = 1 

.22e 05x , 


y(0) 

= 3.5, 

y'(0) 

= 0.35 

18. 

// 

>• ~ 

2y ' = 

I2e ix - Se 

~2x 

? 


>(0) 

= -2, 

v'( 0) 

= 12 

19. 

;/ 

y - 

r 

y ~ 

12v = 144.v 3 

4- 12.5, 


>•(0) 

= 5, 

v'(0) = 

-0.5 

20. 

n i 

y + 

2 y + 

10y = 17 sin a - 37 


y(0) 

= 6.6, 

/(0) 

= -2.2 


21. WRITING PROJECT. Initial Value Problem. Write 
out all the details of Example 3 in your own words. 
Discuss Fig. 51 in more detail. Why is it that some of 
the “half- waves” do not reach the dashed curves, 
whereas others preceding them (and, of course, all later 
ones) excede the dashed curves? 

22. TEAM PROJECT. Extensions of the Method of 
Undetermined Coefficients, (a) Extend the method 
to products of the function in Table 2.1. (b) Extend 
the method to Euler-Cauchy equations. Comment on 
the practical significance of such extensions. 

23. CAS PROJECT. Structure of Solutions of Initial 
Value Problems. Using the present method, find, graph, 
and discuss the solutions y of initial value problems of 
your own choice. Explore effects on solutions caused by 
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changes of initial conditions. Graph y p , y, y — y p 
separately, to see the separate effects. Find a problem in 
which (a) the part of y resulting from y h decreases to zero, 
(b) increases, (c) is not present in the answer y. Study a 


problem with y(0) = 0, /(0) = 0. Consider a problem 
in which you need the Modification Rule (a) for a simple 
root, (b) for a double root. Make sure that your problems 
cover all three Cases I, FI, III (see Sec. 2.2). 


2.8 Modeling: Forced Oscillations. Resonance 

In Sec. 2.4 we considered vertical motions of a mass-spring system (vibration of a mass 
m on an elastic spring, as in Figs. 32 and 52) and modeled it by the homogeneous linear 
ODE 

(1) my" 4- cy' + ky = 0. 

Here y(t) as a function of time t is the displacement of the body of mass m from rest. 
These were free motions, that is, motions in the absence of external forces (outside forces) 
caused solely by internal forces , forces within the system. These are the force of inertia 
my'\ the damping force cy' (if c > 0), and the spring force ky acting as a restoring force. 

We now extend our model by including an external force, call it /*(/), on the right. Then 
we have 

(2*) my" 4- cy + ky = r(t). 

Mechanically this means that at each instant t the resultant of the internal forces is in 
equilibrium with /*(/). The resulting motion is called a forced motion with forcing 
function /*(/), which is also known as input or driving force, and the solution y(r) to be 
obtained is called the output or the response of the system to the driving force. 

Of special interest are periodic external forces, and we shall consider a driving force 
of the form 


/*(;) = F 0 cos cot ( F 0 > 0 y co > 0). 

Then we have the nonhomogeneous ODE 

(2) my" + cy' + ky — F 0 cos cot. 

Its solution will familiarize us with further interesting facts fundamental in engineering 
mathematics, in particular with resonance. 





SEC 2.8 Modeling: Forced Oscillations. Resonance 


85 


Solving the Nonhomogeneous ODE (2) 

From Sec. 2.7 we know that a general solution of (2) is the sum of a general solution y h 
of the homogeneous ODE (I) plus any solution y p of (2). To find y p , we use the method 
of undetermined coefficients (Sec. 2.7), starting from 

(3) y p (t) = a cos cot + b sin cot. 


By differentiating this function (chain rule!) we obtain 

Vp = —coa sin cot + cob cos cot , 
y p = —co 2 a cos cot — co 2 b sin cot. 

Substituting v p? y p , and y p into (2) and collecting the cosine and the sine terms, we get 

[(& — mco 2 )a + cocb] cos cot + [—coca 4- ( k - mo?)b] sin cot = F 0 cos cot . 

The cosine terms on both sides must be equal, and the coefficient of the sine term on the 
left must be zero since there is no sine term on the right. This gives the two equations 

( k — met?) a + cocb = F 0 

(4) 

—coca + (k - mco )b = 0 

for determining the unknown coefficients a and b . This is a linear system. We can solve 
it by elimination. To eliminate b , multiply the first equation by k — mo? and the second 
by —coc and add the results, obtaining 

(k — mo?) 2 ci + o?c 2 a = F 0 (k — mco 2 ). 

Similarly, to eliminate a , multiply the first equation by coc and the second by k — mo? 
and add to get 

o?c 2 b 4* ( k — mo?) 2 b — F 0 coc. 

If the factor (k — mo?) 2 + o?c 2 is not zero, we can divide by this factor and solve for a 
and b , 

_ k — mo? _ coc 

(k — mo?) 2 -f o?c 2 ' b (fc — fjjo?) 2 + o?c 2 

If we set \SUrn = co Q (> 0) as in Sec. 2.4, then k = mco 2 and we obtain 


(5) a = F 0 


m(co 0 2 — o?) 


m?(o)(? - o?) 2 + o?c 


. 2„2 


b = F 0 


coc 

2 _ 


m ( co 0 - co ) 4- co c 


. 2^2 


We thus obtain the general solution of the nonhomogeneous ODE (2) in the form 


( 6 ) 


y(0 = v,,.(r) + y p (t). 
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Here y h is a general solution of the homogeneous ODE (1) and y p is given by (3) with 
coefficients (5). 

We shall now discuss the behavior of the mechanical system, distinguishing between 
the two cases c = 0 (no damping) and c > 0 (damping). These cases will correspond to 
two basically different types of output. 


Case 1. Undamped Forced Oscillations. Resonance 

If the damping of the physical system is so small that its effect can be neglected over the 
time interval considered, we can set c = 0. Then (5) reduces to a = F 0 f[m((o 0 2 — co 2 )] 
and b = 0. Hence (3) becomes (use co 2 = k!m) 

m cos - ■ *[i4W] “• “• 

Here we must assume that co 2 ^ co 2 \ physically, the frequency coKflif) [cycles/sec] of the 
driving force is different from the natural frequency co q I(2tt) of the system, which is the 
frequency of the free undamped motion [see (4) in Sec. 2.4]. From (7) and from (4*) in 
Sec. 2.4 we have the general solution of the “undamped system” 


( 8 ) 


y(t) = C cos (<o 0 t — S) + 


m(<o 0 2 — (o 2 ) 


cos cot. 


We see that this output is a superposition of two harmonic oscillations of the frequencies 
just mentioned. 

Resonance. We discuss (7). We see that the maximum amplitude of y p is (put 
cos cot = 1) 

<9) where p - 1 - (L»)* ■ 

a Q depends on co and co 0 . If co — > co 0 , then p and a 0 tend to infinity. This excitation of 
large oscillations by matching input and natural frequencies (co = co 0 ) is called 
resonance, p is called the resonance factor (Fig. 53), and from (9) we see that p/k = a 0 /F 0 
is the ratio of the amplitudes of the particular solution y p and of the input F 0 cos cot. 
We shall see later in this section that resonance is of basic importance in the study of 
vibrating systems. 

In the case of resonance the nonhomogeneous ODE (2) becomes 


( 10 ) 


n . o Fo 
y + co 0 y = — cos co 0 t. 
m 


Then (7) is no longer valid, and from the Modification Rule in Sec. 2.7 we conclude that 
a particular solution of (10) is of the form 


y P (0 = t(a cos <o 0 t + b sin co 0 t). 
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By substituting this into (10) we find a - 0 and b = F 0 !(2mco 0 ). Hence (Fig. 54) 


( 11 ) 


) J p(0 = 


Fq 

2 mo) 0 


t sin o) 0 i. 


We see that because of the factor t the amplitude of the vibration becomes larger and 
larger. Practically speaking, systems with very little damping may undergo large vibrations 
that can destroy the system. We shall return to this practical aspect of resonance later in 
this section. 



Fig. 54. Particular solution in the case of resonance 


Beats. Another interesting and highly important type of oscillation is obtained if co is 
close to a> 0 . Take, for example, the particular solution [see (8)] 


( 12 ) 





(cos cot — cos (O 0 t ) 


(0) (O 0 ). 


Using (12) in App. 3.1, we may write this as 


y(t) = 



Since a> is close to co 0 , the difference <o Q - co is small. Hence the period of the last sine 
function is large, and we obtain an oscillation of the type shown in Fig. 55, the dashed 
curve resulting from the first sine factor. This is what musicians are listening to when 
they tune their instruments. 
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Fig. 55. Forced undamped oscillation when the difference 
of the input and natural frequencies is small ("beats”) 

Case 2. Damped Forced Oscillations 

If the damping of the mass-spring system is not negligibly small, we have c > 0 and a 
damping term cy f in (1) and (2). Then the general solution y h of the homogeneous ODE 
(1) approaches zero as / goes to infinity, as we know from Sec. 2.4. Practically, it is zero 
after a sufficiently long time. Hence the “transient solution” (6) of (2), given by 
v = y h 4- y p? approaches the “steady-state solution” y p . This proves die following. 


THEOREM 1 


Steady-State Solution 

After a sufficiently long time the output of a damped vibrating system under a purely 
sinusoidal driving force [see (2)] will practically be a harmonic oscillation whose 
frequency is that of the input . 


Amplitude of the Steady-State Solution. Practical Resonance 

Whereas in the undamped case the amplitude of y p approaches infinity as a> approaches 
c o 0 , this will not happen in the damped case. In this case the amplitude will always be finite. 
But it may have a maximum for some a> depending on the damping constant c. This may 
be called practical resonance. It is of great importance because if c is not too large, then 
some input may excite oscillations large enough to damage or even destroy the system. 
Such cases happened, in particular in earlier times when less was known about resonance. 
Machines, cars, ships, airplanes, bridges, and high-rising buildings are vibrating mechanical 
systems, and it is sometimes rather difficult to find constructions that are completely free 
of undesired resonance effects, caused, for instance, by an engine or by strong winds. 

To study the amplitude of y p as a function of a>, we write (3) in the form 

(13) y p (t) = C* cos (cot - 7]). 

C* is called the amplitude of y p and rj the phase angle or phase lag because it measures 
the lag of the output behind the input. According to (5). these quantities are 


C*(co) = V^TV 2 = , 0 , 

m 2 ( coq 2 — a) 2 ) 2 4- a ) 2 c 2 


tan Tj(w) = 


b_ 

a 


(OC 

m(a> o 2 " «> 2 ) 


( 14 ) 
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Let us see whether C*(co) has a maximum and, if so, find its location and then its size. 
We denote the radicand in the second root in C* by R. Equating the derivative of C* to 
zero, we obtain 


= F °(" i * _3/2 )[2'»W - *> 2 )(-2 .») + 2a>c 2 ]. 

The expression in the brackets [. . .] is zero if 

(15) c 2 = 2m 2 (o> 0 2 _ a?) (a> 0 2 = Urn). 


By reshuffling terms we have 

2m 2 (o 2 = 2 m 2 a) 0 2 — c 2 = 2 mk — c 2 . 

The right side of this equation becomes negative if c 2 > 2 mk, so that then (15) has no 
real solution and C* decreases monotone as <o increases, as the lowest curve in Fig. 56 
on p. 90 shows. If c is smaller, c 2 < 2 mk, then (15) has a real solution <o = o> max , where 

c 2 

(15*) ax = a> 0 2 - — — j . 


From (15*) we see that this solution increases as c decreases and approaches <o 0 
as c approaches zero. See also Fig. 56. 

The size of C*(o> max ) is obtained from (14), with co 2 = given by (15*). For this 
a) 2 we obtain in the second radicand in (14) from (15*) 

c 4 

m 2 ((O 0 2 - &4ax) 2 = ^2 and «4ax‘ -2 

The sum of the right sides of these two formulas is 

(c 4 + 4 m 2 (o 2 c 2 — 2c 4 ) /(4m 2 ) = c 2 (4m 2 o> 0 2 
Substitution into (14) gives 



- c 2 )/(4m 2 ). 


( 16 ) 


C*(*> max) = 


2mF 0 

c\/ Am 2 o) 2 ~ c 2 


We see that C*(o> max ) is always finite when c > 0, Furthermore, since the expression 

c 2 4m 2 o> 0 2 — c 4 = c 2 {Anik — c 2 ) 

in the denominator of (16) decreases monotone to zero as c 2 (< 2 mk) goes to zero, the 
maximum amplitude (16) increases monotone to infinity, in agreement with our result in 
Case 1 . Figure 56 shows the amplification C*!F 0 (ratio of the amplitudes of output and 
input) as a function of co for m = 1, k = 1, hence co 0 = 1, and various values of the 
damping constant c. 
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Figure 57 shows the phase angle (the lag of the output behind the input), which is less 
than -77/2 when co < co 0i and greater than tt/2 for a) > <o Q . 




Fig. 56. Amplification C*/F 0 as a function 
of co for m = 1, k = 1, and various values 
of the damping constant c 


Fig. 57. Phase lag rj as a function of <a for 
m = 1, k = 1, thus oj 0 = 1, and various 
values of the damping constant c 
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STEADY-STATE SOLUTIONS 


Find the steady-state oscillation of the mass-spring system 
modeled by the given ODE. Show the details of your 
calculations. 

1. y n + 6y' + 8y =• 130 cos 3 / 

2. 4 y" + By' + 1 3y = 8 sin 1.5/ 

3* y" + y 9 + 4.25y = 221 cos 4.5/ 

4. y" + 4y' + 5y = cos / - sin / 

5. (D 2 + 2 Z> + / )v = -sin 2 / 

6. (D 2 + 40 + 3 1)y = cos t + § cos 3/ 

7. (O 2 + 60 + ]8/)y = cos 3/ - 3 sin 3/ 

8. (O 2 + 20 + 10/)y = -25 sin 4/ 


9-14 


TRANSIENT SOLUTIONS 


Find the transient motion of the mass-spring system 
modeled by the given ODE. (Show the details of your 
work.) 


9. y" + 2y' + 0.75y = 13 sin / 

10. y n + 4y ; + 4v = cos 4/ 

11. 4 y" + 12 / + 9 y = 75 sin 3 1 

12. (D 2 + 50 + 4/)y = sin 2/ 

13. (O 2 + 30 + 3.25/)y = 13 - 39 cos It 

14. (O 2 + 20 + 5/)y = 1 + sin / 


15-20 


INITIAL VALUE PROBLEMS 


Find the motion of the mass-spring system modeled by 
the ODE and initial conditions. Sketch or graph the 
solution curve. In addition, sketch or graph the curve of 


y — y p to see when the system practically reaches the 

steady state. 

15. y" + 2y' + 26y = 13 cos 3/, 

>< 0) «’l f /( 0) = 0.4 

16. y" + 64y = cos/, y(0) = 0, y r ( 0) = 1 

17. y" + 6y* + 8v = 4 sin 2/, y(0) = 0.7, 

y'(0) = -11.8 

18. (O 2 + 20 + I)y = 75(sin t - | sin It + ^ sin 3/), 

.V(0) = 0, ’ /( 0) = 1 

19. (40 2 + 120 + 1 3/ ) y = 12 cos / - 6 sin /, 

y( 0) = 1. /(O) = -1 

20. y* + 25y = 99 cos 4.9/, y( 0) = 2, / ( 0) = 0 

21. (Beats) Derive the formula after (12) from (12). Can 
there be beats if the system has damping? 

22. (Beats) How does the graph of the solution in Prob. 20 
change if you change (a) y(0), (b) the frequency of the 
driving force? 

23. WRITING PROJECT. Free and Forced Vibrations. 
Write a condensed report of 2-3 pages on the most 
important facts about free and forced vibrations. 

24. CAS EXPERIMENT. Undamped Vibrations, 
(a) Solve the initial value problem y” + y = cos (at , 
co 2 i* 1, y(0) = 0, /(O) = 0. Show that the solution 
can be written 
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(b) Experiment with (17) by changing to to see the 
change of the curves from those for small to (> 0) to 
beats, to resonance and to large values of to (see Fig. 58). 


i , 

fl 


IA 

f\| 


■¥ 

t/i 

On 

\ 


20k 


co = 0.2 




Fig. 58. Typical solution curves in CAS Experiment 24 


25. TEAM PROJECT. Practical Resonance, (a) Give 
a detailed derivation of the crucial formula (16). 

(b) By considerin g dC *ldc show that C*(o> max ) 
increases as c (= V2 mk) decreases. 

(c) Illustrate practical resonance with an ODE of your 
own in which you vary c, and sketch or graph 
corresponding curves as in Fig. 56. 

(d) Take your ODE with c fixed and an input of two 
terms, one with frequency close to the practical 
resonance frequency and the other not. Discuss and 
sketch or graph the output. 

(e) Give other applications (not in the book) in which 
resonance is important. 

26. (Gun barrel) Solve 

fl — r 2 /7T 2 if 0 ^2 t ^ TT 

y" + y = 

l 0 if t > 7T, 

0) = /( 0) = o. 

This models an undamped system on which a force F 
acts during some interval of time (see Fig. 59), for 
instance, the force on a gun barrel when a shell is fired, 
the barrel being braked by heavy springs (and then 
damped by a dashpot, which we disregard for 
simplicity). Hint. At 7rboth y andy' must be continuous. 


m= 1 *=1 

F i ■■■ **!■■ AAAA 

| 

F= l-tz/itj 

~~^\y=o 

— 

i i 



[ * 

t 


Fig. 59. Problem 26 


2 .< Modeling: Electric Circuits 

Designing good models is a task the computer cannot do. Hence setting up models has 
become an important task in modem applied mathematics. The best way to gain experience 
is to consider models from various fields. Accordingly, modeling electric circuits to be 
discussed will be profitable for all students , not just for electrical engineers and computer 
scientists. 

We have just seen that linear ODEs have important applications in mechanics (see also 
Sec. 2.4). Similarly, they are models of electric circuits, as they occur as portions of large 
networks in computers and elsewhere. The circuits we shall consider here are basic 
building blocks of such networks. They contain three kinds of components, namely, 
resistors, inductors, and capacitors. Figure 60 on p. 92 shows such an i?LC-circuit, as 
they are called. In it a resistor of resistance R ft (ohms), an inductor of inductance L H 
(henrys), and a capacitor of capacitance C F (farads) are wired in series as shown, and 
connected to an electromotive force E(t) V (volts) (a generator, for instance), sinusoidal 
as in Fig. 60, or of some other kind. R, L, C, and E are given and we want to find the 
current I(t) A (amperes) in the circuit. 
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C 



£(*) = 2? 0 siiuof 

Fig. 60. RLC-circuit 


An ODE for the current I(t) in the /?LC-circuit in Fig. 60 is obtained from the following 
law (which is the analog of Newton’s second law, as we shall see later). 

KirchhofTs Voltage Law (KVL). 7 The voltage (the electromotive force) impressed on 
a closed loop is equal to the sum of the voltage drops across the other elements of the 
loop. 

In Fig. 60 the circuit is a closed loop, and the impressed voltage E(t) equals the sum 
of the voltage drops across the three elements R y L, C of the loop. 

Voltage Drops. Experiments show that a current / flowing through a resistor, inductor 
or capacitor causes a voltage drop (voltage difference, measured in volts) at the two ends; 
these drops are 

RI (Ohm’s law) Voltage drop for a resistor of resistance R ohms (fl), 

/ dl 

LI — L — Voltage drop for an inductor of inductance L henry s (H), 
dt 

Q 

— Voltage drop for a capacitor of capacitance C farads (F). 

Here Q coulombs is the charge on the capacitor, related to the current by 
/(/) = , equivalently, Q(t ) = J I(t) dt. 

This is summarized in Fig. 61. 

According to KVL we thus have in Fig. 60 for an /?LC-circuit with electromotive force 
E(t) = E 0 sin cot (E 0 constant) as a model the u integro-differential equation” 

(1 ') LI' +RI+± fldt = E(t) = E 0 sin cat. 


7 GUSTAV ROBERT K1RCHHOFF (1824-1887). German physicist. Later we shall also need KirchhofTs 
current law (KCL): 

At any point of a circuit, the sum of the inflowing currents is equal to the sum of the outflowing currents. 

The units of measurement of electrical quantities are named after ANDRJ; MARIE AMPERE (1775-1836), 
French physicist, CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer, 
MICHAEL FARADAY (1791-1867), English physicist, JOSEPH HENRY (1797-1878), American physicist. 
GEORG SIMON OHM (1789-1854), German physicist, and ALESSANDRO VOLTA (1745-1827), Italian 
physicist. 
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Name 

Symbol 


Notation 

Unit 

Voltage Drop 

Ohm’s resistor 

Inductor 

Capacitor 

—WA~ 

—)h- 

R 

L 

C 

Ohm’s resistance 

Inductance 

Capacitance 

ohms (ft) 
henrys (H) 
farads (F) 

RI 

T dl 

L dt 
QIC 


Fig. 61. Elements in an RLC- circuit 


To get rid of the integral, we differentiate (1 ') with respect to r, obtaining 

(1) U" + Rl' + ^ I = E f (t) = E 0 oj cos cot. 

This shows that the current in an /?LC-circuit is obtained as the solution of this 
nonhomogeneous second-order ODE (1) with constant coefficients. 

From (1 '), using / = Q\ hence I r = g”, we also have directly 

(l") LQ" +RQ r + -^Q = E 0 sin cot. 

But in most practical problems the current I{t) is more important than the charge Q(t), 
and for this reason we shall concentrate on (1) rather than on (l"). 

Solving the ODE (1) for the Current 
Discussion of Solution 

A general solution of (1) is the sum / = I h 4* 7 p , where l h is a general solution of the 
homogeneous ODE corresponding to (1) and I p is a particular solution of (1). We first 
determine I p by the method of undetermined coefficients, proceeding as in the previous 
section. We substitute 

(2) I p = a cos cot 4- b sin cot 

Ip = co(— a sin cot + b cos cot) 

Ip = co 2 (—a cos cot — b sin cot) 

into (1). Then we collect the cosine terms and equate them to E 0 co cos cot on the right, 
and we equate the sine terms to zero because there is no sine term on the right, 

Lco 2 (—a) + Rcob -l- ci/C = E 0 co (Cosine terms) 

Lco 2 (-b) -I- Rco(-a) + b/C = 0 (Sine terms). 

To solve this system for a and b, we first introduce a combination of L and C, called the 
reactance 


S = 



(3) 
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Dividing the previous two equations by co, ordering them, and substituting 5 gives 

—Sci + Rb = Eq 
-R o -5/7 = 0. 

We now eliminate b by multiplying the first equation by 5 and the second by /?, and 
adding. Then we eliminate a by multiplying the first equation by R and the second by 
—5, and adding. This gives 


-(S 2 + R 2 )a = E 0 S, ( R 2 + &)b = E 0 R. 

in any practical case the resistance R is different from zero, so that we can solve for a 
and b , 


(4) 


-E 0 S 
R 2 + 5 2 ’ 


E 0 R 

r 2 + s 2 ‘ 


Equation (2) with coefficients a and b given by (4) is the desired particular solution I p of 
the nonhomogeneous ODE (1) governing the current / in an RLC - circuit with sinusoidal 
electromotive force. 

Using (4), we can write 7 p in terras of “physically visible” quantities, namely, amplitude 
I Q and phase lag 9 of the current behind the electromotive force, that is, 

(5) 7 P (0 = Jq sin (cot - 6) 

where [see (14) in App. A3.1] 


/„ = Va 2 + b 2 = 


Eq 

Vr 2 + S 2 ’ 


tan 6 = 



S_ 
R ' 


The quantity V/? 2 + S 2 is called the impedance. Our formula shows that the impedance 
equals the ratio E 0 /l 0 . This is somewhat analogous to E/I = R (Ohm’s law). 

A general solution of the homogeneous equation corresponding to (1) is 

*h = C\e + c 2 e 


where and A 2 are the roots of the characteristic equation 


, R 1 

A + 7 A + Ic ' °' 

We can write these roots in the form A x = — a + (3 and A 2 = -a — j6, where 


R 


/?■ 


1 




2L 


4L‘ 


LC 2 L 


4L 


Now in an actual circuit, R is never zero (hence R > 0). From this it follows that I h 
approaches zero, theoretically as t — *■ «>, but practically after a relatively short time. (This 
is as for the motion in the previous section.) Hence the transient current / = I h + I p tends 
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EXAMPLE 1 


to the steady-state current / p , and after some time the output will practically be a harmonic 
oscillation, which is given by (5) and whose frequency is that of the input (of the 
electromotive force). 

RLC- Circuit 

Find the current /(/) in an #LC-circuit with R = 1 1ft (ohms), L = O.i H (henry), C = 1(T 2 F (farad), which 
is connected to a source of voltage £(/) = 100 sin 400; (hence 63§Hz = 63 § cycles/sec, because 
400 = 63§ • 2 tt). Assume that current and charge are zero when / — 0. 

Solution . Step 7. General solution of the homogeneous ODE . Substituting R, L, C, and the derivative £'(/) 
into (1), we obtain 


0.L/" + II/' + 100/ = 100 -400 cos 400/. 

Hence the homogeneous ODE is 0.1/" + 1 1/' 4- 100/ = 0. Its characteristic equation is 

0.1 A 2 + 1 1 A + 100 = 0. 

The roots are A x = - 10 and A 2 = -100. The corresponding general solution of the homogeneous ODE is 

/*<«) - <-\e ~ l0t + c 2 e~ 100t . 

Step 2. Particular solution I p of (1). We calculate die reactance .S = 40 — 1/4 = 39.75 and the steady-state 
current 


I p (t) = a cos 400/ + h sin 400/ 


witli coefficients obtained from (4) 


a 


-100-39.75 
II 2 + 39.75 2 


= -2.3368, 


b = 


100 - 11 
1 1 2 + 39.75 2 


= 0.6467. 


Hence in our present case, a general solution of the nonhomogeneous ODE (1) is 
(6) /(/) = c,* -10 * + c 2 c -loot - 2.3368 cos 400/ + 0.6467 sin 400/. 


Step 3. Particular solution satisfying the initial conditions . How to use Q(0) = 0? We finally determine c x 
and c 2 from the initial conditions 1(0) = 0 and Q( 0) = 0. From the first condition and (6) we have 

(7) /( 0) = t'i + c 2 - 2.3368 = 0, hence c 2 = 2.3368 — c\. 

Furthermore, using (l') with / = 0 and noting that the integral equals Q(t) (see the formula before (l')), we 
obtain 


/ 1 , 

Ll ( 0) + /? • 0 + — -0 = 0, hence /'( 0) = 0. 

Differentiating (6) and setting / = 0, we thus obtain 

/'( 0) = 10c, - I00c'2 + 0 + 0.6467 • 400 = 0. hence - 10c, = 100(2.3368 - Cj) - 258.68. 

The solution of this and (7) is Ci = -0.2776, c 2 — 2.6144. Hence the answer is 

HO = -0.2776s - l0t + 2.6144c -1001 - 2.3368 cos 400/ + 0.6467 sin 400/. 

Figure 62 on p. 96 shows /(/) as well as / 7 >(/), which practically coincide, except for a very short time near 
/ = 0 because the exponential terms go to zero very rapidly. Thus after a very short time the current will 
practically execute harmonic oscillations of the input frequency 63§ Hz = 63 1 cycles/sec. Its maximum amplitude 
and phase lag can be seen from (5). which here takes the form 


/ p (/> = 2.4246 sin (400/ - 1.3008). 
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Fig. 62. Transient and steady-state currents in Example 1 


Analogy of Electrical and Mechanical Quantities 

Entirely different physical or other systems may have the same mathematical model 
For instance, we have seen this from the various applications of the ODE y = ky in 
Chap. 1 . Another impressive demonstration of this unifying power of mathematics is 
given by the ODE (1) for an electric jRLC-circuit and the ODE (2) in the last section for 
a mass-spring system. Both equations 

Li" + /?/' + —/ = E 0 (o cos cot and my" 4- cy' + ky = F 0 cos cot 

L 

are of the same form. Table 2.2 shows the analogy between the various quantities involved. 
The inductance L corresponds to the mass m and, indeed, an inductor opposes a change 
in current, having an "inertia effect” similar to that of a mass. The resistance R corresponds 
to the damping constant c % and a resistor causes loss of energy, just as a damping dashpot 
does. And so on. 

This analogy is strictly quantitative in the sense that to a given mechanical system we 
can construct an electric circuit whose current will give the exact values of the displacement 
in the mechanical system when suitable scale factors are introduced. 

The practical importance of this analogy is almost obvious. The analogy may be used 
for constructing an "electrical model” of a given mechanical model, resulting in substantial 
savings of time and money because electric circuits are easy to assemble, and electric 
quantities can be measured much more quickly and accurately than mechanical ones. 


Table 2.2 Analogy of Electrical and Mechanical Quantities 


Electrical System 

Mechanical System 

Inductance L 

Mass m 

Resistance R 

Damping constant c 

Reciprocal 1/C of capacitance 

Spring modulus k 

Derivative E 0 o> cos cot of 1 
electromotive force J 

Driving force F 0 cos cot 

Current I(t) 

Displacement y(t) 
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1. (RL- circuit) Model the £L-circuit in Fig. 63. Find the 
general solution when R , L, E are any constants. Graph 
or sketch solutions when L — 0.1 H, R = 5 

£ = 12 V. 

2. (ftL-circuit) Solve Prob. I when £ = £ 0 sin to/ and £, 
L, £ 0 , (o are arbitrary. Sketch a typical solution. 

3. (£C-circuit) Model the £C-circuit in Fig. 66. Find the 
current due to a constant £. 

4. (£C-circuit) Find the current in the RC-circuit in 
Fig. 66 with £ = £ 0 sin a>t and arbitrary R , C, £ 0 , and to. 



Fig. 63. RL-circuit 




A/W 


E(t) 

■II 

C 

Fig. 66. RC-circuit 



Fig. 67. Current 1 in Problem 3 


5. (LC-circuit) This is an RLC-circuit with negligibly 
small R (analog of an undamped mass-spring system). 
Find the current when L = 0.2 H, C =• 0.05 F, and 
£ = sin 1 V, assuming zero initial current and charge. 

6. (LC-circuit) Find the current when L — 0.5 H, 
C = 8 * 10“ 4 F, £ = t 2 V and initial current and charge 
zero. 

pM>l RiC-CIRCUITS (FIG. 60, P. 92) 

7. (Tuning) In tuning a stereo system to a radio station, 
we adjust the tuning control (turn a knob) that changes 
C (or perhaps L) in an RLC-circuit so that the amplitude 
of the steady-state current (5) becomes maximum. For 
what C will this happen? 

8. (Transient current) Prove the claim in the text that if 
R 0 (hence R > 0), then the transient current 
approaches I p as t — ► 

9. (Cases of damping) What are the conditions for an 
RLC-clrcuit to be (I) overdamped, (II) critically 
damped, (III) underdamped? What is the critical 
resistance R crit (the analog of the critical damping 
constant 2 V5)? 


10-12 1 Find the steady-state current in the RLC-circuit 
in Fig. 60 on p, 92 for the given data. (Show the details of 
your work.) 

10. R = 8 H, L = 0.5 H, C = 0.1 F, £ = 100 sin It V 

11. R = 1 H, L = 0.25 H, C = 5 • I0“ 5 F, £ = 110 V 

12. R = 2 a, L = 1 H, C = 0.05 F, £ = ^ sin 3/ V 


13-15 


Find the transient current (a general solution) 
in the RLC-circuit in Fig. 60 for the given data. (Show the 
details of your work.) 

13. R = 6 a L = 0.2 H, C = 0.025 F, £ = 110 sin 10/ V 

14. R = 0.2 0 , L = 0.1 H, C = 2 F, £ = 754 sin0.5/V 

15. R = 1/10 a L = 1/2 H, C = 100/13 F, 

£ = *T 4t ( 1.932 cos g/ + 0.246 sin £/) V 


16-18 


Solve the initial value problem for the 
£LC-circuit in Fig. 60 with the given data, assuming zero 
initial current and charge. Graph or sketch the solution. 
(Show the details of your work.) 
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16. R = 4 a L = 0.1 H, C = 0.025 F, E = 10 sin 10/ V 

17. R = 6 a, L = 1 H, C = 0.04 F, 

E = 600 (cos / 4 4 sin /) V 

18. K = 3.6 a L = 0.2 H, C = 0.0625 F, 

E = J64 cos 10/V 

19. WRITING PROJECT. Analogy of RLC-Circuits and 
Damped Mass-Spring Systems, (a) Write an essay of 
2-3 pages based on Table 2.2. Describe the analogy in 
more detail and indicate its practical significance. 

(b) What 7?LC-circuit with L — l H is the analog of 
the mass-spring system with mass 5 kg, damping 
constant 10 kg/sec, spring constant 60 kg/sec 2 , and 
driving force 220 cos 10/? 

(c) Illustrate the analogy with another example of your 
own choice. 

20. TEAM PROJECT. Complex Method for Particular 
Solutions, (a) Find a particular solution of the complex 
ODE 

(8) U" + RJ' + ~J= E 0 <oe iat (/ = V=T) 

by substituting 7 p = Ke tal (K unknown) and its 
derivatives into (8), and then take the real part 7 p of 7 p , 
showing that 7 p agrees with (2), (4). Hint. Use the Euler 
formula e ltait = cos cot 4 / sin c at [(11) in Sec. 2.2 with 
cot instead of /]. Note that E 0 co cos cot in (1) is the real 
part of E Q coe i<at in (8). Use / 2 = -1. 


(b) The complex impedance Z is defined by 


Z = 7? 4 iS = R 4 



Show that K obtained in (a) can be written as 


K = 


Eo 
iZ ‘ 


Note that the real part of Z is /?, the imaginary part is 
the reactance S , and the absolute value is the impedance 

|z| = V /? 2 + s 2 as defined before. See Fig. 68. 

(c) Find the steady-state solution of the ODE 
7" 4 27 / 4 37 = 20 cos /, first by the real method and 
then by the complex method, and compare. (Show the 
details of your work.) 

(d) Apply the complex method to an 7?LC-circuit of 
your choice. 



Fig. 68. Complex impedance Z 


2.10 Solution by Variation of Parameters 

We continue our discussion of nonhomogeneous linear ODEs 
(1) y" + p(x)y' + g(x)y = r(x). 

In Sec. 2.6 we have seen that a general solution of (1) is the sum of a general solution y h 
of the corresponding homogeneous ODE and any particular solution y p of (1 ). To obtain y p 
when r(x) is not too complicated, we can often use the method of undetermined coefficients , 
as we have shown in Sec. 2.7 and applied to basic engineering models in Secs. 2.8 and 2.9. 

However, since this method is restricted to functions r(x) whose derivatives are of a form 
similar to /*(*) itself (powers, exponential functions, etc.), it is desirable to have a method valid 
for more general ODEs (1), which we shall now develop. It is called the method of variation 
of parameters and is credited to Lagrange (Sec. 2.1). Here p, q, r in (1) may be variable 
(given functions of x ), but we assume that they are continuous on some open interval 7. 
Lagrange’s method gives a particular solution y p of (1) on 7 in the form 


( 2 ) 


y P {x) = dx + y 2 J^~ dx 
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EXAMPLE 1 


where y l9 y 2 form a basis of solutions of the corresponding homogeneous ODE 

(3) _v" + p(x)y f + q(x)y = 0 
on /, and W is the Wronskian of y l9 y 2i 

(4) W = y x y 2 ~ y 2 y[ (see Sec. 2.6). 


CAUTION! The solution formula (2) is obtained under the assumption that the ODE 
is written in standard form, with y" as the first term as shown in (1). If it starts with f{x)y", 
divide first by f(x). 

The integration in (2) may often cause difficulties, and so may the determination of y l9 
y 2 if (1) has variable coefficients. If you have a choice, use the previous method. It is 
simpler. Before deriving (2) let us work an example for which you do need the new 
method. (Try otherwise.) 


Method of Variation of Parameters 

Solve the nonhomogeneous ODE 

// 1 

v + v = sec x = . 

cos A 


Solution. A basis of solutions of the homogeneous ODE on any interval is y x = cos .v, y 2 = sin a. This 
gives the Wronskian 


W(yi, y 2 ) = cos a cos a — sin a (—sin a) = 1. 

From (2), choosing zero constants of integration, we get the particular solution of the given ODE 
y p - -cos a j sin x sec a dx + sin jt J cos a sec x dx 
= cos a In |cos a| 4- a sin x 


(Fig. 69). 


Figure 69 shows y p and its First term, which is small, so that a sin x essentially determines the shape of the curve 
of y p . (Recall from Sec. 2.8 that we have seen x sin a in connection with resonance, except for notation.) From 
y p and the general solution y tl = cjVi + C 2 .V 2 of the homogeneous ODE we obtain the answer 

y — y h + y v — (cj + In |cos .v|) cos a + (c 2 + a) sin a. 

Had we included integration constants -Ci, c 2 in (2), then (2) would have given the additional 
Ci cos x + c 2 sin a = ctfi + c 2 y 2 . that is, a general solution of the given ODE directly from (2). This will 
always be the case. M 



Fig. 69. Particular solution y p and its first term in Example 1 
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Idea of the Method. Derivation of (2) 

What idea did Lagrange have? What gave the method the name? Where do we use the 
continuity assumptions? 

The idea is to start from a general solution 

y/iW = ctfiC X) + C 2 y 2 (x) 

of the homogeneous ODE (3) on an open interval / and to replace the constants (“the 
parameters”) c t and c 2 by functions it(x) and u(jc); this suggests the name of the method. 
We shall determine u and v so that the resulting function 

(5) y p (x) = uQc)yi(x) + v(x)y 2 (x) 

is a particular solution of the nonhomogeneous ODE (1). Note that y h exists by Theorem 
3 in Sec. 2.6 because of the continuity of p and q on /. (The continuity of r will be used 
later.) 

We determine u and v by substituting (5) and its derivatives into (1). Differentiating 
(5), we obtain 


Vp = u'y x -1- uy[ + v f y 2 + vy 2 . 

Now y v must satisfy (1 ). This is one condition for two functions u and v. It seems plausible 
that we may impose a second condition. Indeed, our calculation will show that we can 
determine u and v such that y p satisfies (1) and u and v satisfy as a second condition the 
equation 

( 6 ) + v f y 2 = 0 . 

This reduces the first derivative y p to the simpler form 
(V) y p = uyi + vy 2 . 

Differentiating (7), we obtain 

(8) y p = «}'! + uyi + v y 2 + vy 2 . 

We now substitute ,v p and its derivatives according to (5), (7), (8) into (1). Collecting 
terms in u and terms in v, we obtain 

u(y" + py[ + qyi) + v{y' 2 + py 2 + qy 2 ) + u'y[ + v'y 2 = r. 

Since Vj. and y 2 are solutions of the homogeneous ODE (3), this reduces to 

(9a) u’yi + v'y' 2 = r. 

Equation (6) is 


(9b) 


uy i + v'y 2 = 0. 
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This is a linear system of two algebraic equations for the unknown functions u and v\ 
We can solve it by elimination as follows (or by Cramer’s rule in Sec. 7.6). To eliminate 
u\ we multiply (9a) by — y 2 and (9b) by y' 2 and add, obtaining 

- y^yi) = -yw, thus u'W = -y 2 r. 

Here, W is the Wronskian (4) of y lt y 2 . To eliminate u we multiply (9a) by y lt and (9b) 
by —y[ and add, obtaining 

v'iyiyL - 3’2.Vi) = .Vi' - , thus v'w = y x r. 

Since y lt y 2 form a basis, we have W ¥= 0 (by Theorem 2 in Sec. 2.6) and can divide by IV, 


( 10 ) 



v' = 


y£ 
w ' 


By integration, 



These integrals exist because r(*) is continuous. Inserting them into (5) gives (2) and 
completes the derivation. ■ 




1 1-171 GENERAL solution 

Solve the given nonhomogeneous ODE by variation of 
parameters or undetermined coefficients. Give a general 
solution. (Show the details of your work.) 

I n , 

• y + y = esc x 
2. y" — 4 y f 4 4 y = x 2 e x 
3. x 2 y" — 2xy ' + 2 y = jc 3 cos a* 

4. y" — 2y f + y = e x sin a 

5. y" 4- y = tan a 

6. x 2 y n — xy ' -f y = x In |a| 

7. y n + y = cos x + sec a 

8. y” — 4y ; 4- 4y = \2e 2 */x A 

9. (D 2 - 2D + I)y = a 2 4 a*- V 

10. (D 2 - I)y = 1 /cosh a 

11. (D 2 4 4/)y = cosh 2a 
12. (x 2 D 2 4 a D - \l)y = 3a** 1 4 3a 
13. (a 2 D 2 — 2a D 4 2 1)y — a 3 sin x 


14. (x 2 D 2 4 a D - 4 1)y = 1/a 2 

15. (D 2 4 I)y = sec a — 10 sin 5a 

16. (a 2 D 2 4 xD 4 (a 2 - |)/)y = a 3/2 cos a. 

Hint. To find y t , y 2 set y = ma“ 1/2 . 

17. (a 2 D 2 4 xD 4 (a 2 - \)I)y = a 3/2 sin a. 

Hint: As in Prob. 16. 

18. TEAM PROJECT. Comparison of Methods. The 
undetermined-coefficient method should be used 
whenever possible because it is simpler. Compare it 
with the present method as follows. 

(a) Solve y" 4 2 y’ — 15y = 17 sin 5a by both 
methods, showing all details, and compare. 

(b) Solve y" 4 9 y = r x 4 r 2 , r x = sec 3a, 
r 2 = sin 3a by applying each method to a suitable 
function on the right. 

(c) Invent an undetermined-coefficient method for 
nonhomogeneous Euler-Cauchy equations by 
experimenting. 
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APT ER r R EVf EVT O ITE5 TIONS AND PROBLEMS 


1. What general properties make linear ODEs particularly 
attractive? 

2. What is a general solution of a linear ODE? A basis of 
solutions? 

3. How would you obtain a general solution of a 
nonhomogeneous linear ODE if you knew a general 
solution of the corresponding homogeneous ODE? 

4. What does an initial value problem for a second-order 
ODE look like? 

5. What is a particular solution and why is it more common 
than a general solution as the answer to practical 
problems? 

6. Why are second-order ODEs more important in 
modeling than ODEs of higher order? 

7. Describe the applications of ODEs in mechanical 
vibrating systems. What are the electrical analogs of 
those systems? 

8. If a construction, such as a bridge, shows undesirable 
resonance, what could you do? 

9—18) GENERAL SOLUTION 

Find a general solution. Indicate the method you are using 

and show the details of your calculation. 

9. y" - 2 / - 8y = 52 cos 6 a 

10. y" + 6/ + 9v = e~ Zx - 27.v 2 

11. y" + 8/ + 25v = 26 sin 3 a 

12. yy" = 2y' 2 

13. (x 2 D 2 + 2 xD - 1 2/) v = 1 /a- 3 

14. (x 2 D 2 + 6a :D + 6/)y = a- 2 

15. ( D 2 -2 D + I)y = A-V 

16. (D 2 -4 D + 5 1)y = e 2x esc a 

17. (Z> 2 - 2D + 2/)y = e* esc a 

18. (4a 2 D 2 - 24.v/> + 49/)y = 36 a 5 

[19-25 1 INITIAL VALUE PROBLEMS 

Solve the following initial value problems. Sketch or graph 

the solution. (Show the details of your work.) 

19. y" + 5/ - I4v = 0. y(0) = 6, y'(0) = -6 

20. y" + 6/ + 18.v = 0. v(0) = 5. y'(0) = -21 

21. a 2 v" - av' - 24y = 0, y(l) = 15, /(l) = 0 

22. .v 2 y" + 15 a/ + 49v = 0, v(l) = 2, /(I) = - 1 1 

23. y" + 5y' + 6y = 108a 2 , y(0) = 1 8, /( 0) = -26 

24. y" + / + 2.5y = 13 cos a, v(0) = 8.0, 

y'(0) = 4.5 

25. (a 2 D 2 + a D - 4 1)y = a 3 , y( 1 ) = -4/5, 
y'(l) = 93/5 


1 26—34 1 APPLICATIONS 

26. Find the steady-state solution of the system in Fig. 70 
when m = 4. c = 4. k = 17 and the driving force is 
202 cos 3/. 

27. Find the motion of the system in Fig. 70 with mass 
0.25 kg, no damping, spring constant 1 kg/sec 2 , and 
driving force 15 cos 0.5r — 7 sin 1.5f nt, assuming zero 
initial displacement and velocity. For what frequency 
of the driving force would you get resonance? 

28. In Prob. 26 find the solution corresponding to initial 
displacement 10 and initial velocity 0. 

29. Show that the system in Fig. 70 with m = 4, c = 0, 
k = 36, and driving force 61 cos 3. If exhibits beats. 
Hint: Choose zero initial conditions. 

30. In Fig. 70 let m = 2. c = 6, k = 27. and 

r(t) = 10 cos cot. For what co will you obtain the steady- 
state vibration of maximum possible amplitude? 
Determine this amplitude. Then use this o) and the 
undetermined-coefficient method to see whether you 
obtain the same amplitude. 

31. Find an electrical analog of the mass-spring system in 
Fig. 70 with mass 0.5 kg, spring constant 40 kg/sec 2 , 
damping constant 9 kg/sec. and driving force 
102 cos 6/ nt. Solve the analog, assuming zero initial 
current and charge. 

32. Find the current in the /?LC-circuit in Fig. 71 
when L = 0.1 H. R = 20 H, C = 2 • 1(T 4 F, and 
E(r) = 110 sin 415/ V (66 cycles/sec). 

33. Find the current in the /?LC-circuit when L = 0.4 H. 
/? = 40 a C = 10“ 4 F, and E(t) = 220 sin 314f V 
(50 cycles/sec). 

34. Find a particular solution in Prob. 33 by the complex 
method. (See Team Project 20 in Sec. 2.9.) 



Fig. 70. Mass-spring Fig. 71. RLC- circuit 

system 
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Second-Order Linear ODEs 


Second-order linear ODEs are particularly important in applications, for instance, 
in mechanics (Secs. 2.4, 2.8) and electrical engineering (Sec. 2.9). A second-order 
ODE is called linear if it can be written 

(1) .v" + pWy' + q(x)y = r(x) (Sec. 2.1). 

(If the first term is, say, f(x)y” 9 divide by f(x) to get the “standard form” (1) with 
v" as the first term.) Equation (1) is called homogeneous if r(jt) is zero for all x 
considered, usually in some open interval; this is written r(x) = 0. Then 

(2) y" 4- p(x)y f 4- q{x)y = 0. 

Equation (1) is called nonhomogeneous if r(x) & 0 (meaning r(x) is not zero for 
some x considered). 

For the homogeneous ODE (2) we have the important superposition principle 
(Sec. 2.1) that a linear combination v = ky l 4- ly 2 of two solutions y l9 v 2 is again 
a solution. 

Two linearly independent solutions y l9 v 2 of (2) on an open interval / form a basis 
(or fundamental system) of solutions on /, and y = c\y x + c 2 .y 2 with arbitrary 
constants c l9 c 2 is a general solution of (2) on /. From it we obtain a particular 
solution if we specify numeric values (numbers) for c x and c 2 , usually by prescribing 
two initial conditions 

(3) y(x 0 ) = K 0 , y'(x 0 ) = K x (. x 0 , K 0 , K x given numbers; Sec. 2.1). 

(2) and (3) together form an initial value problem. Similarly for (1) and (3). 

For a nonhomogeneous ODE (1) a general solution is of the form 

(4) y = y h + y p (Sec. 2.7). 

Here y h is a general solution of (2) and y p is a particular solution of (1). Such a y p 
can be determined by a general method (variation of parameters, Sec. 2.10) or in 
many practical cases by the method of undetermined coefficients. The latter applies 
when (1) has constant coefficients p and q, and /*(.v) is a power of .v, sine, cosine, 
etc. (Sec. 2.7). Then we write (1) as 

(5) y" 4* ay 4 - by = r(x) (Sec. 2.7). 

The corresponding homogeneous ODE y 4- ay 1 4- by = 0 has solutions y = e**, 
where A is a root of 


( 6 ) 


A 2 4- a\ 4- b = 0. 
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Hence there are three cases (Sec. 2.2): 


Case 

Type of Roots 

General Solution 

i 

Distinct real A x , A 2 

y = c x e^ x + c 2 e* zX 

n 

Double -\a 

y = (c x + c 2 x)e~ axI2 

ra 

Complex ~\a ± ico* 

y = cos (o*x + B sin a>*x) 


Important applications of (5) in mechanical and electrical engineering in connection 
with vibrations and resonance are discussed in Secs. 2.4, 2.7, and 2.8. 

Another large class of ODEs solvable “algebraically” consists of the 
Euler-Cauchy equations 

(7) x 2 y" + axy' + by = 0 (Sec. 2.5). 

These have solutions of the form y = x m , where mi is a solution of the auxiliary 
equation 

(8) m 2 + (a — 1 )m + b — 0. 

Existence and uniqueness of solutions of (1) and (2) is discussed in Secs. 2.6 
and 2.7, and reduction of order in Sec. 2.1. 




CHAPTER 3 


Higher Order Linear ODEs 


In this chapter we extend the concepts and methods of Chap. 2 for linear ODEs from order 
n = 2 to arbitrary order n. This will be straightforward and needs no new ideas. However, 
the formulas become more involved, the variety of roots of the characteristic equation (in 
Sec. 3.2) becomes much larger with increasing n, and the Wronskian plays a more 
prominent role. 

Prerequisite : Secs. 2.1, 2.2, 2.6, 2.7, 2.10. 

References and Answers to Problems: App. 1 Part A, and App. 2. 

3.1 Homogeneous Linear ODEs 

Recall from Sec. 1.1 that an ODE is of nth order if the nth derivative y (n) = d n y/dx n of 
the unknown function y(x) is the highest occurring derivative. Thus the ODE is of the form 

F(x, y, /,■■■, y (n> ) = 0 

where lower order derivatives and y itself may or may not occur. Such an ODE is called 
linear if it can be written 

(1) V (rt) + p n -i(x)y in -" + ■■■+ Pl (x)y' + p 0 (x)y = r(x). 

(For n = 2 this is (1) in Sec. 2.1 with Pl = p and Po = q). The coefficients p 0 , • • • , Pn -i 
and the function r on the right are any given functions of x, and y is unknown. y <ro> has 
coefficient I . This is practical. We call this the standard form. (If you have p n {x)y (n \ 
divide by p n {x) to get this form.) An nth-order ODE that cannot be written in the form 

(1) is called nonlinear. 

If r(x) is identically zero, r(x) = 0 (zero for all x considered, usually in some open 
interval /), then (1) becomes 

(2) y <n) + p n _ 1 (*)/ n - 1> + • • • + Pl {x)y' + Po (x)y = 0 

and is called homogeneous. If ;*(.*) is not identically zero, then the ODE is called 
nonhomogeneous. This is as in Sec. 2.1. 

A solution of an Hth-order (linear or nonlinear) ODE on some open interval / is a 
function y = h(x) that is defined and n times differentiable on / and is such that the ODE 
becomes an identity if we replace the unknown function y and its derivatives by h and its 
corresponding derivatives. 
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Homogeneous Linear ODE: Superposition Principle, 
General Solution 

Sections 3. 1-3.2 will be devoted to homogeneous linear ODEs and Sec. 3.3 to 
nonhomogeneous linear ODEs. The basic superposition or linearity principle in Sec. 2.1 
extends to nth order homogeneous linear ODEs as follows. 


THEOREM 1 


Fundamental Theorem for the Homogeneous Linear ODE (2) 

For a homogeneous linear ODE (2), sums and constant multiples of solutions on 
some open internal 1 are again solutions on /. (This does not hold for a 
nonhomogeneous or nonlinear ODE!) 


The proof is a simple generalization of that in Sec. 2.1 and we leave it to the student. 

Our further discussion parallels and extends that for second-order ODEs in Sec. 2.1. 
So we define next a general solution of (2), which will require an extension of linear 
independence from 2 to n functions. 


DEFINITION 


General Solution, Basis, Particular Solution 

A general solution of (2) on an open interval / is a solution of (2) on I of the form 
(3) >•(.*) = c^Cv) + • • • + c n y n (x) (c 1( • • • , c n arbitrary) 

where y l9 • • • , y n is a basis (or fundamental system) of solutions of (2) on I; that 
is, these solutions are linearly independent on A as defined below. 

A particular solution of (2) on 1 is obtained if we assign specific values to the 
n constants c x , ■ • • , c n in (3). 


DEFINITION 


Linear Independence and Dependence 

n functions .Vitx*), • • * , y n ( x) are called linearly independent on some interval I 
where they are defined if the equation 

(4) k-xy^x) + • • • + Jc^yjx) = 0 on I 

implies that all k x , • • • , k n are zero. These functions are called linearly dependent 
on / if this equation also holds on I for some k x , ■ ■ ■ , k n not all zero. 


(As in Secs. 1.1 and 2.1, the arbitrary constants c lt • • • , c n must sometimes be restricted 
to some interval.) 

If and only if y x , • • • , ,v„ are linearly dependent on /, we can express (at least) one of 
these functions on / as a “linear combination” of the other n - 1 functions, that is, as 
a sum of those functions, each multiplied by a constant (zero or not). This motivates the 
term “linearly dependent.” For instance, if (4) holds with k x * 0, we can divide by k x and 
express )'i as the linear combination 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


THEOREM 2 


1 

Vi = - -^-(* 2)’2 + • • • + k n y n ). 

Note that when n = 2, these concepts reduce to those defined in Sec. 2.1. 

Linear Dependence 

Show that the functions y x = .v 2 , y 2 — 5.v, y 3 = 2v are linearly dependent on any interval. 

Solution . y 2 = Oyi 4- 2.5y 3 . This proves linear dependence on any interval. ■ 

Linear Independence 

Show that yi = x. y 2 = x 2 , y 3 = jc 3 are linearly independent on any interval, for instance, on — 1 ^ x • ^ 2. 
Solution . Equation (4) is k^x -1- k 2 x 2 + k 3 .x 3 = 0. Taking (a) x = — 1. (b) x = 1, (c) .v = 2, we get 

(a) — ki 4 A '2 — & 3 = 0, (b) ki + k 2 4* ^3 — 0, (c) 2^ + 4A* 2 4- S k 3 = 0. 

A '2 = 0 from (a) -I- (b). Then k 3 — 0 from (c) —2(b). Then k x = 0 from (b). This proves linear independence. 
A better method for testing linear independence of solutions of ODEs will soon be explained. ■ 

General Solution. Basis 

Solve the fourth-order ODE 

v iv - 5y" + 4y = 0 (where y ,v = d\ldx\ 

Solution ♦ As in Sec. 2.2 we try and substitute y = e Xx . Omitting the common factor we obtain the 
characteristic equation 

A 4 - 5A 2 4- 4 = 0. 

This is a quadratic equation in fx = A 2 , namely, 

fj? - 5/X + 4 = {fJL - 1 )(/a - 4) = 0. 

The roots are /x = 1 and 4. Hence A = -2, -1,1, 2. This gives four solutions. A general solution on any 
interval is 

y = c ie - 2 * + 4- c 3 e x + c^ 2 * 

provided those four solutions are linearly independent. This is true but will be shown later. H 

Initial Value Problem. Existence and Uniqueness 

An initial value problem for the ODE (2) consists of (2) and n initial conditions 

(5) y(x 0 ) = K 0 , y'(x o) = • • • , / n_1, (x 0 ) = K n - X 

with given a* 0 in the open interval 7 considered, and given K 0 , • • • , AT n _ x . 

In extension of the existence and uniqueness theorem in Sec. 2.6 we now have the following. 

Existence and Uniqueness Theorem for Initial Value Problems 

If the coefficients Po(x), • • • , p n -i(x) of { 2) are continuous on some open interval I 
and *o is in /, then the initial value problem (2), (5) has a unique solution y(x) on L 


Existence is proved in Ref. [All] in App. 1. Uniqueness can be proved by a slight 
generalization of the uniqueness proof at the beginning of App. 4. 
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EXAMPLE 4 


THEOREM 3 


Initial Value Problem for a Third-Order Euler-Cauchy Equation 

Solve the following initial value problem on any open interval / on the positive .v-axis containing x = 1. 

a 3 /" - 3a- 2 / + 6a/ - 6 y = 0, .v(l) = 2, y'(l) = 1. y"(l) = -4. 

Solution . Step 1. General solution . As in Sec. 2.5 we try y = x m . By differentiation and substitution, 

m(m - \)(m - 2)a* w - 3 /?i(«i - l).v m -1- 6 /ma” 1 - 6a m = 0. 

Dropping x m and ordering gives m 3 — 6m 2 + 1 bn - 6 = 0. If we can guess the root m = I, we can divide 
by m — 1 and find the other roots 2 and 3, thus obtaining the solutions x r a 2 , a 3 , which are linearly independent 
on / (see Example 2). [In general one shall need a root-finding method, such as Newton’s (Sec. 19.2), also 
available in a CAS (Computer Algebra System).] Hence a general solution is 

v = Ci a + c 2 a 2 + c 3 a 3 

valid on any interval /, even when it includes x = 0 where the coefficients of the ODE divided by a 3 (to have 
the standard form) are not continuous. 

Step 2. Particular solution. The derivatives are / = c x + 2c 2 x + 3c^v 2 and y" = 2c 2 *f 6 c 3 a. From this and 
y and the initial conditions we get by setting x = I 

(a) y( 1) = Ci + c 2 + c's = 2 

(b) /(l) = c x + 2c- 2 + 3c 3 = 1 

(c) /'(l) = 2 c 2 + 6c 3 = -4. 

This is solved by Cramer’s rule (Sec. 7.6), or by elimination, which is simple, as follows, (b) - (a) gives 
(d) c 2 + 2c 3 = -1. Then (c) - 2(d) gives c 3 = — 1. Then (c) gives c 2 - 1. Finally c x = 2 from (a). 
Answer: y — 2x -1- a 2 - a 3 . H 


Linear Independence of Solutions. Wronskian 

Linear independence of solutions is crucial for obtaining general solutions. Although it 
can often be seen by inspection, it would be good to have a criterion for it. Now Theorem 
2 in Sec. 2.6 extends from order n = 2 to any n. This extended criterion uses the Wronskian 
W of 72 solutions y 1? • • • , y n defined as the nth order determinant 


( 6 ) 


W( yi ,---, yn ) = 


y i 

/ 

.Vi 


.V2 

/ 

^2 


.Vn 


v (n— 1) v (n— 1) 
) 7 1 } 7 2 


y?- 1 ' 


Note that W depends on x since y l9 • • * , y n does. The criterion states that these solutions 
form a basis if and only if W is not zero; more precisely: 


Linear Dependence and Independence of Solutions 

Let the ODE (2) have continuous coefficients p 0 (x), ■ * * , p n -\{x) on an open 
interval I. Then n solutions y x , • • • , y n of (2) on 1 are linearly dependent on l if 
and only if their Wronskian is zero for some x =x 0 in I. Furthermore, ifW is zero for 
x = Xq, then W is identically zero on /. Hence if there is an x x in I at which W is 
not zero , then y l9 • • • , y n are linearly independent on /, so that they form a basis 
of solutions of (2) on /. 
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PROOF (a) Let y l9 • • • , y n be linearly dependent solutions of (2) on /. Then, by definition, there 
are constants k l9 • • • , k n not all zero , such that for all a in /, 

(7) *1>’1 + • • * + k n y n = 0. 

By n - 1 differentiations of (7) we obtain for all x in / 


( 8 ) 


k\y\ 


k n} f n 


= 0 


*l.Vi n X> + • ’ • + knyn V = °* 


(7), (8) is a homogeneous linear system of algebraic equations with a nontrivial solution 
k l9 • * • , k n . Hence its coefficient determinant must be zero for every a* on /, by Cramer’s 
theorem (Sec. 7.7). But that determinant is the Wronskian W, as we see from (6). Hence 
W is zero for every x on /. 

(b) Conversely, if W is zero at an a 0 in /, then the system (7), (8) with x = a 0 has a solution 
&i*, • • * , k n * 9 not all zero, by the same theorem. With these constants we define the 
solution y* = k x *y x + • • ■ + k n *y n of (2) on /. By (7), (8) this solution satisfies the 
initial conditions y% x 0 ) = 0, • • • , y* (n_1> (A 0 ) = 0. But another solution satisfying the 
same conditions is y = 0. Hence y* = y by Theorem 2, which applies since the coefficients 
of (2) are continuous. Together, y* = k 1 ^y 1 +■••-{- k n * y n = 0 on I. This means linear 
dependence of y x , • • • , y n on I. 

(c) If W is zero at an a 0 in /, we have linear dependence by (b) and then W = 0 by (a). 

Hence if W is not zero at an x 1 in /, the solutions y x , • - * , y n must be linearly independent 
on /. ■ 


EXAMPLE 5 Basis, Wronskian 


We can now prove that in Example 3 we do have a basis. In evaluating W, pull out the exponential functions 
columnwise. In the result, subtract Column 1 from Columns 2, 3, 4 (without changing Column 1). Then 
expand by Row 1 . In the resulting third-order determinant, subtract Column I from Column 2 and expand 
the result by Row 2: 


e~ 2x 

«-* 


e 2 * 


1 1 

l 

1 

-26-** 


e x 

2e 2x 


-2 -l 

1 

2 

4e~ 2x 

r x 

e* 

4c 2 * 


4 1 

1 

4 

1 

00 

a 

1 

If 

-<r* 

e x 

8c 2 * 


-8 -I 

1 

8 


1 3 

-3 -3 

7 9 


4 

0 

16 


= 72. ■ 


A General Solution of (2) Includes All Solutions 

Let us first show that general solutions always exist. Indeed, Theorem 3 in Sec. 2.6 extends 
as follows. 


THEOREM 4 


Existence of a General Solution 

If the coefficients p 0 ( a), • • • , p n _i(A) of (2) are continuous on some open interval 
4 then (2) has a general solution on 1. 




no 


CHAP. 3 Higher Order Linear ODEs 


PROOF We choose any fixed x Q in /. By Theorem 2 the ODE (2) has n solutions y ls • • • , y n , 
where yj satisfies initial conditions (5) with K J _ 1 = 1 and all other K's equal to zero. Their 
Wronskian at x 0 equals 1. For instance, when n = 3, then yi(x 0 ) = 1, y^x 0 ) = 1, 
„V 3 (*o) — 1 1 and the other initial values are zero. Thus, as claimed. 



y 1 (^ 0 ) 

y 2 Uo) 

.V 3 U 0 ) 


1 

0 

0 

WO’iUo), y 2 C*o)> .v 3 Uo)) = 

yi(x 0 ) 

yz(xo) 

V. 3 (^o) 

= 

0 

1 

0 


.viVo) 

y 2 ,( x o) 

V3CV0) 


0 

0 

1 


Hence for any n those solutions y l9 • • • , y n are linearly independent on /, by Theorem 3. 
They form a basis on /, and y = Cx.Vj 4- • • • 4- c n .y M is a general solution of (2) on /. ■ 

We can now prove the basic property that from a general solution of (2) every solution 
of (2) can be obtained by choosing suitable values of the arbitrary constants. Hence an 
nth order linear ODE has no singular solutions, that is, solutions that cannot be obtained 
from a general solution. 


THEOREM 5 


General Solution Includes All Solutions 

If the ODE (2) has continuous coefficients p 0 (x), * • , A?.-iW on some open interval 
I, then every solution y = Y(x) of (2) on / is of the form 

(9) Y(x) = CjViCv) + • • • + C n y n (x) 

where y lt * • ■ , y n is a basis of solutions of (2) on 1 and C 1( * • * , C n are suitable 
constants . 


PROOF Let Y be a given solution and y = c 'i)'i + • • • + c n y n a general solution of (2) on I. We 
choose any fixed a 0 in / and show that we can find constants c l9 • • *, c n for which y and 
its first n — 1 derivatives agree with Y and its corresponding derivatives at a* 0 . That is, 
we should have at x = x 0 


c iYi 

4- • 

• * 4- 

Vn 

= Y 

CiYi 

4- • 

• • 4- 

OnYn 

= Y' 

~ v <n- 1) 
c l)l 

4- • 

• • 4- 

(n— 1) 
n 

— y(n— 1) 


But this is a linear system of equations in the unknowns c 1? ••• , c n . Its coefficient 
determinant is the Wronskian W of y l9 • • • , y n at x Q . Since y l9 * • • , y n form a basis, they 
are linearly independent, so that W is not zero by Theorem 3. Hence (10) has a unique 
solution ci = C x , • * • , c n = C n (by Cramer’s theorem in Sec. 7.7). With these values 
we obtain the particular solution 

)’*(*) = + • • • + C tl y n (x) 

on I. Equation (10) shows that y* and its first n - I derivatives agree at x 0 with Y and 
its corresponding derivatives. That is, y * and Y satisfy at at 0 the same initial conditions. 
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The uniqueness theorem (Theorem 2) now implies that y* = Y on 7. This proves the 
theorem. ■ 

This completes our theory of the homogeneous linear ODE (2). Note that for n — 2 it is 
identical with that in Sec. 2.6. This had to be expected. 


|U5| TYPICAL EXAMPLES OF BASES 

To get a feel for higher order ODEs, show that the given 
functions are solutions and form a basis on any interval. 
Use Wronskians. (In Prob. 2, x > 0.) 

1. 1, A, A' 2 , A 3 , y iv = 0 

2. 1 , a 2 , a 4 , aV"-3a/ + 3/ =0 

3. e x , xe x , x 2 e x , y m — 3 y ,f + 3y ; — y = 0 

4. e 2x cos a, e 2x sin a, e~ 2x cos a, e~ 2x sin a, 
y iv - 6y” 4 25y = 0. 

5. 1, a, cos 3 a, sin 3 a, y lv 4 9 y" = 0 

6. TEAM PROJECT. General Properties of Solutions 
of Linear ODEs. These properties are important in 
obtaining new solutions from given ones. Therefore 
extend Team Project 34 in Sec. 2.2 to zzth-order ODEs. 
Explore statements on sums and multiples of solutions 
of (1) and (2) systematically and with proofs. 
Recognize clearly that no new ideas are needed in this 
extension from n = 2 to general ;z. 

|7-I9| LINEAR INDEPENDENCE 
AND DEPENDENCE 

Are the given functions linearly independent or dependent 
on the positive A-axis? (Give a reason.) 

7. 1, e x , e~ x 8. a + 1, a + 2, a 

9. In a. In a 2 , (In a ) 2 10. e x , e~ x , sinh 2 a 


11. 

A 2 , A'|a| 

, x 

12. 

A. 1/a, 0 

13. 

sin 2a, 

sin a, cos a 

14. 

cos 2 a, sin 2 a, cos 2a 

15. 

tan a, col a, 1 

16. 

(a - l) 2 , (a 4- l) 2 , x 

17. 

sin a, sin \x 

18. 

cosh a, sinh a, cosh 2 a 

19. 

cos 2 A, 

sin 2 a, 27 r 



20. 

TEAM 

PROJECT. 

Linear Independence and 


Dependence, (a) Investigate the given question about 
a set S of functions on an interval /. Give an example. 
Prove your answer. 

( 1 ) If S contains the zero function, can S be linearly 
independent? 

(2) If S is linearly independent on a subinterval J of 7, 
is it linearly independent on /? 

(3) If 5 is linearly dependent on a subinterval J of 7, 
is it linearly dependent on 7? 

(4) If S is linearly independent on 7, is it linearly 
independent on a subinterval P 

(5) If S is linearly dependent on 7, is it linearly 
independent on a subinterval P 

(6) If S is linearly dependent on 7, and if T contains 5, 
is T linearly dependent on 7? 

(b) In what cases can you use the Wronskian for 
testing linear independence? By what other means can 
you perform such a test? 


3.2 Homogeneous Linear ODEs with Constant 
Coefficients 


In this section we consider nth-order homogeneous linear ODEs with constant coefficients, 
which we write in the form 

(1) y (n) + o n _iy (n “ 1) + • • • + + a 0 y = 0 

where y Cn) = d n y/dx n , etc. We shall see that this extends the case n — 2 discussed in 
Sec. 2.2. Substituting y = e Kx (as in Sec. 2.2), we obtain the characteristic equation 


( 2 ) 


A" + A Cn - X) + • • • + a x \ + a 0 = 0 
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EXAMPLE 1 


of (1). If A is a root of (2), then y = e Xx is a solution of (1). To find these roots, you may 
need a numeric method, such as Newton’s in Sec. 19.2, also available on the usual CASs. 
For general n there are more cases than for n = 2. We shall discuss all of them and 
illustrate them with typical examples. 

Distinct Real Roots 

If all the n roots A ls • • - , A^ of (2) are real and different, then the n solutions 


constitute a basis for all x. The corresponding general solution of (1) is 
(4) y = c x e + • • • + c n e . 

Indeed, the solutions in (3) are linearly independent, as we shall see after the example. 

Distinct Real Roots 

Solve the ODE y m - 2 y" - y + 2y = 0. 

Solution . The characteristic equation is A 3 - 2A 2 — A + 2 = 0. It has the roots —1, 1, 2; if you find one 
of them by inspection, you can obtain the other two roots by solving a quadratic equation (explain!). The 
corresponding general solution (4) is y = c t e~ x + c 2 e x + c 3 e 2x . B 

Linear Independence of (3). Students familiar with nth-order determinants may verify 
that by pulling out all exponential functions from the columns and denoting their product 
by E , thus E — exp [(A a + • * • + A h )a], the Wronskian of the solutions in (3) becomes 


(5) 



e A,x 

e X2X 



A l( ? AlX 

\ z e X2X 

A n e A ” x 

w = 

A x V lX 

\ 2 2 e X2X 

A„V ,,X 


A?-V'* 

AJ-V** 

\%~ 1 e* nX 


= E 


1 

Ai 

Ax 2 


1 

^2 

a 2 2 


i 

A„ 

An 2 . 


71-1 


A? 


A 


71-1 

2 


K 


i 


The exponential function E is never zero. Hence W— 0 if and only if the determinant on 
the right is zero. This is a so-called Vandermonde or Cauchy determinant 1 . It can be 
shown that it equals 


'ALEXANDRE THEOPHILE VANDERMONDE (1735-1796), French mathematician, who worked on 
solution of equations by determinants. For CAUCHY see footnote 4, in Sec. 2.5. 
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THEOREM 1 


THEOREM 2 


EXAMPLE 2 


(6) (_!)«<«— D/2y 

where V is the product of all factors Aj — A k with j < £ (= «); for instance, when n = 3 
we get — V = —(A! - A 2 )(A X — A 3 )(A 2 - A 3 ). This shows that the Wronskian is not zero 
if and only if all the n roots of (2) are different and thus gives the following. 


Basis 

Solutions _Vi = e 1 , • • • , y n = e *' of { 1) (i with any real or complex A/s) form a 
basis of solutions of ( 1 ) on any open intewal if and only if all n roots of (2) are 
different. 


Actually, Theorem 1 is an important special case of our more general result obtained 
from (5) and (6): 


Linear Independence 

Any number of solutions of ( 1) of the form e^ x are linearly independent on an open 
interval I if and only if the corresponding A are all different . 


Simple Complex Roots 

If complex roots occur, they must occur in conjugate pairs since the coefficients of (I) 
are real. Thus, if A = y -F ia> is a simple root of (2), so is the conjugate A = y — ia>, and 
two corresponding linearly independent solutions are (as in Sec. 2.2, except for notation) 

Vj = e yx cos cox , y 2 = e yx sin cox. 

Simple Complex Roots. Initial Value Problem 

Solve the initial value problem 

y'" - y" + 100 y - lOOv = 0, v(0) = 4, y'(0) =11, y"(0) = -299. 

Solution . The characteristic equation is A 3 - A 2 + 100A - 100 = 0. It has the root 1. as can perhaps be 
seen by inspection. Then division by A — 1 shows that the other roots are ± 10/. Hence a general solution and 
its derivatives (obtained by differentiation) are 

y = c x e x + A cos 10.v + B sin I0.v. 
y = c x e x — 104 sin 10.v + 105 cos 1 0.v. 
y" = c r e x - 1004 cos 10 a- - 1005 sin 10 a\ 

From this and the initial conditions we obtain by setting x = 0 

(a) t'i + A = 4, (b) ^+105=11, (c) c x - 1004 = -299. 

We solve this system for the unknowns 4, 5. cj. Equation (a) minus Equation (c) gives 1014 = 303, 4 = 3. 
Then = l from (a) and 5 = 1 from (b). The solution is (Fig. 72) 

y = e x + 3 cos 10.v + sin 10.x*. 


This gives the solution curve, which oscillates about e x (dashed in Fig. 72 on p. 114). 
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EXAMPLE 3 



Fig. 72. Solution in Example 2 


Multiple Real Roots 

If a real double root occurs, say, A x = A 2 , then y x = y 2 in (3), and we take y x and xy ± as 
corresponding linearly independent solutions. This is as in Sec. 2.2. 

More generally, if A is a real root of order m, then m corresponding linearly independent 
solutions are 

(7) xe Xx , x?e* x , • • • , * m “V v . 

We derive these solutions after the next example and indicate how to prove their linear 
independence. 

Real Double and Triple Roots 

Solve the ODE / - 3/ v + 3 y'" - v" = 0. 

Solution . The characteristic equation A 5 - 3A 4 + 3A 3 — A 2 = 0 has the roots A x = A 2 = 0 and 
A 3 = A 4 = A 5 = 1, and the answer is 

(8) y = ci + c 2 x + (c 3 + c 4 .v + c 5 x 2 )e x . ■ 

Derivation of (7). We write the left side of (1) as 

L[y] = y (n) + i3 ,(n—1) + • • • + *oy- 

Let y = e Xx . Then by performing the differentiations we have 

L[e Xx ] = (A n + a n ~i\ n ~ 1 + • • • + a 0 )e Xx . 

Now let Aj be a root of mth order of the polynomial on the right, where m ^ n . For 
in < n let A, n+1 , • * • , A n be the other roots, all different from X v Writing the polynomial 
in product form, we then have 


L[e Xx ] = (A - A 1 ) m 7?(A)e Aa ‘ 

with h( A) = 1 if m = n 7 and /i(A) = (A - Ki+i) • • • (A — A„) if m < n. Now comes the 
key idea: We differentiate on both sides with respect to A, 

4 - L[e**] = m(A - + (A - A,)” 1 [/i(A)e Aa: ]. 


(9) 
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The differentiations with respect to a: and A are independent and the occurring derivatives 
are continuous, so that we can interchange their order on the left: 

-k •“] - 

The right side of (9) is zero for A = A x because of the factors A — A x (and m ^ 2 since 
we have a multiple root!). Hence L[xe X}X ] = 0 by (9) and (10). This proves that xe AlX is 
a solution of ( 1 ). 

We can repeat this step and produce x 2 e l ’\ • • • , x m ” 1 e l * T by another m — 2 such 
differentiations with respect to A. Going one step further would no longer give zero on 
the right because the lowest power of A — A x would then be (A — A x )°, multiplied by 
m\h( A) and /?(A X ) ^ 0 because h{ A) has no factors A — A x ; so we get precisely the solutions 
in (7). 

We finally show that the solutions (7) are linearly independent. For a specific n 
this can be seen by calculating their Wronskian, which turns out to be nonzero. For 
arbitrary m we can pull out the exponential functions from the Wronskian. This gives 
(e Xx ) vl = e Amx times a determinant which by “row operations” can be reduced to the 
Wronskian of 1. x, • • • , a™” 1 . The latter is constant and different from zero (equal to 
1 !2! • * • (m — 1)!). These functions are solutions of the ODE y (m) = 0, so that linear 
independence follows from Theroem 3 in Sec. 3.1. ■ 

Multiple Complex Roots 

In this case, real solutions are obtained as for complex simplejoots above. Consequently, 
if A = y + i(o is a complex double root, so is the conjugate A = y — ico. Corresponding 
linearly independent solutions are 

(11) e v* cos co. r, e yx sin cox y xe ,ya? cos cox , xe ** sin au\ 

The first two of these result from e Xx and e Kx as before, and the second two from xe Xx 
and xe Xx in the same fashion. Obviously, the corresponding general solution is 

(12) y = e yx [(A 1 4 A 2 x) cos cox 4 (B 1 4 B 2 x) sin cox]. 

For complex triple roots (which hardly ever occur in applications), one would obtain 
two more solutions x 2 e Xx cos cox , x 2 e yX ' sin cox , and so on. 


GO) 


- ri - 1 




|~M>1 ode for given basis 

Find an ODE (1) for which the given functions form a basis 
of solutions. 

1. <r x , e 2 *, tf 3 * 2. jctf- 2 *, jr 2 *- 2 * 

3. e x , e~ x , cos x , sin x 

4. cos a*, sin a, a* cos a, a sin a 

5. 1 , a, cos 2a, sin 2a 

6. er 2 *, <r*\ <r T . e 2r , 1 


7-12 


GENERAL SOLUTION 


Solve the given ODE. (Show the details of your work.) 

7 tn , t r\ 

. y + y = o 
8. y v - 29 y" + 100^ = 0 
. y + y ~ y - y = 0 


10. 16y iv - 8y" + y = 0 

11. y m - 3 y" - Ay 4 6y = 0 

12. y iv + 3y" - Ay = 0 
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13-18 


INITIAL VALUE PROBLEMS 


Solve by a CAS, giving a general solution and the particular 

solution and its graph. 

13. y iv + 0.45/" - 0.165/' + 0.0045/ - 0.00 175y = 0, 
v(0) = 17.4, y'(0) = -2.82, y"(0) = 2.0485. 
y"'(0) = -1.458675 

14. Ay"' + 8y" + 4ly' + 37y = 0. y(0) = 9. 
y'(0) = -6.5. y"(0) = -39.75 

15. y'" + 3.2y" + 4.8 lv' = 0, y(0) = 3.4, 
y'(0) = -4.6, y"(0) = 9.91 

16. v iv + 4v = 0. v(0) = i v'(0) = -§. y"(0) = |, 
y"'(0) = -1 

17. v iv - 9y" - 400y = 0, v(0) = 0. y'(0) = 0, 
y"(0) = 41, y"'(0) = 0 

18. y"' + 7.5y" + 14.25y' - 9.l25v = 0. 
y(0) = 10.05, y'(0) = -54.975. 
y"(0) = 257.5125 


19. CAS PROJECT. Wronskians. Euler-Cauchy 
Equations of Higher Order. Although Euler-Cauchy 
equations have variable coefficients (powers of a), we 
include them here because they fit quite well into the 
present methods. 

(a) Write a program for calculating Wronskians. 

(b) Apply the program to some bases of third-order 
and fourth-order constant-coefficient ODEs. Compare 


the results with those obtained by the program most 
likely available for Wronskians in your CAS. 

(c) Extend the solution method in Sec. 2.5 to any order 
n. Solve aV" + 2a 2 v” - 4 aV + 4v = 0 and another 
ODE of your choice. In each case calculate the 
Wronskian. 

20. PROJECT. Reduction of Order. This is of practical 
interest since a single solution of an ODE can often be 
guessed. For second order, see Example 7 in Sec. 2.1. 

(a) How could you reduce the order of a linear 
constant-coefficient ODE if a solution is known? 

(b) Extend the method to a variable-coefficient ODE 

v"' + p&x)y M + P\(x)y + p 0 (x)y = 0. 

Assuming a solution y x to be known, show that another 
solution is y 2 (x) = u(x)yi(x) with u(.x) = / z(a) clx and 
c obtained by solving 

)‘\Z + (3y{ + P 2 )'i)z + (3y" + 2p 2 y[ + po'i)z = 0. 

(c) Reduce 

X s y m — 3x 2 y" + (6 — .v 2 )at / — (6 — x 2 )y — 0, 

using y x = .v (perhaps obtainable by inspection). 

21. CAS EXPERIMENT. Reduction of Order. Starting 
with a basis, find third-order ODEs with variable 
coefficients for which the reduction to second order 
turns out to be relatively simple. 


3.3 Nonhomogeneous Linear ODEs 

We now turn from homogeneous to nonhomogeneous linear ODEs of / 2 th order. We write 
them in standard form 

(1) .V (n) + Pn-i(A').v (n - 1) + • • • + p 1 (x)y f + Po (x)y = r( x) 

with y (n) = d n y/dx n as the first term, which is practical, and r(x) ^ 0. As for second-order 
ODEs, a general solution of (1) on an open interval / of the .v-axis is of the form 

(2) y(x) = y h ( x) + v p (a). 

Here .y/,(x) = c 1 v 1 (a') -h • • • 4- c n y n (A) is a general solution of the corresponding 
homogeneous ODE 

(3) y (n) + Ai_i(A)/ n-1) + • • • + Pi(x)y' + p 0 (x)y = 0 

on I. Also, y p is any solution of (1) on / containing no arbitrary constants. If (1) has 
continuous coefficients and a continuous /*(*) on /, then a general solution of (1) exists 
and includes all solutions. Thus (1) has no singular solutions. 
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EXAMPLE 1 


An initial value problem for (1) consists of (1) and n initial conditions 
(4) y(x 0 ) = K 0 , y'(x 0 ) = K ly •••, 

with A' 0 in /. Under those continuity assumptions it has a unique solution. The ideas of 
proof are the same as those for w = 2 in Sec. 2.7. 


Method of Undetermined Coefficients 

Equation (2) shows that for solving (I) we have to determine a particular solution of (1). 
For a constant-coefficient equation 

(5) y in) + + • • • + a x y' 4- a 0 y = r(x) 

(i a 0 , • • • , a n ^ x constant) and special /*(*) as in Sec. 2.7, such a y p {x) can be determined 
by the method of undetermined coefficients, as in Sec. 2.7, using the following rules. 


(A) Basic Rule as in Sec. 2.7. 

(B) Modification Rule. If a term in your choice for y p (x) is a solution of the 
homogeneous equation (3), then multiply y p {x) byx k , where k is the smallest positive 
integer such that no term of x k y p (x) is a solution of ( 3). 

(C) Sum Rule as in Sec. 2.7. 


The practical application of the method is the same as that in Sec. 2.7. It suffices to 
illustrate the typical steps of solving an initial value problem and, in particular, the new 
Modification Rule, which includes the old Modification Rule as a particular case (with 
k = 1 or 2). We shall see that the technicalities are the same as for n = 2, perhaps except 
for the more involved determination of the constants. 

Initial Value Problem. Modification Rule 

Solve the initial value problem 

(6) /" + 3/ + 3/ + y = 30 e - *, y(0) = 3, v'(0) = -3. ,v"(0) = -47. 

Solution. Step 1. The characteristic equation is A 3 + 3A 2 + 3A + l = (A + I) 3 = 0. It has the triple root 
A = — 1 . Hence a general solution of the homogeneous ODE is 

y h = c 1 e~ x + c 2 xe~ x + c 3 

~ (<T + C 2 X + c 3 xZ )e~ x . 

Step 2. If we try y p = Ce~ x , we get - C + 3C — 3C + C = 30, which has no solution. Try Cite - * and Cx Z e~ x . 
The Modification Rule calls for 


y p = Cx 3 e~ x . 

y p = C(3x 2 - x 3 )e~ x . 

y'p = C( 6x - 6a- 2 + x 3 )e~ x 

y'p= C(6 - 18a- + 9a 2 - x 3 )e~ x . 


Then 
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Substitution of these expressions into (6) and omission of the common factor e~ x gives 

C(6 - l&v + 9.v 2 - ,v 3 ) + 3C(6.v - 6.v 2 + .v 3 ) + 3C(3.v 2 - .v 3 ) + C.v 3 = 30. 

The linear, quadratic, and cubic terms drop out, and 6C = 30. Hence C = 5. This gives y p = 5x 3 e~ x . 

Step 3. We now write down v = v;, + y p , the general solution of the given ODE. From it we find c x by the 
first initial condition. We insert the value, differentiate, and determine c 2 from the second initial condition, insert 
the value, and finally determine c 3 from v w (0) and the third initial condition: 

v = \j r + y p = (c x + c 2 a + + 5x 3 e~ x , y(0) = c x = 3 

v' = [-3 + c 2 + (~c 2 + 2 c 3 )x + (15 - c 3 )x 2 - 5.r s ]*“* y'(0) = -3 + c 2 = -3, c 2 = 0 

v" = [3 + 2c 3 + (30 - 4c 3 ).v + (-30 + c 3 )x 2 + 5.v 3 ]e“ x , y"(0) = 3 + 2c 3 = -47, c 3 = -25. 

Hence the answer to our problem is (Fig. 73) 

y = (3 - 25x z )e~ x + 5.v 3 c~ r . 

The curve of y begins at (0, 3) with a negative slope, as expected from the initial values, and approaches zero 
as .v— > sc. The dashed curve in Fig. 73 is y p . B 



Fig. 73. y and y p (dashed) in Example 1 


Method of Variation of Parameters 

The method of variation of parameters (see Sec. 2.10) also extends to arbitrary order n. 
It gives a particular solution y p for the nonhomogeneous equation (l) (in standard form 
with y (n) as the first term!) by the formula 


(7) 


n 


y P (x) = 2 >'k(x) 

k = 1 


r w k (x) 
j W(x) 


r(x) dx 



Wj/M 
W( x) 


r(x) dx + • • • + 



W n (x) 

W(x) 


r(x) dx 


on an open interval 7 on which the coefficients of ( 1 ) and r(x) are continuous. In (7) the 
functions .y lt • • • , v n form a basis of the homogeneous ODE (3), with Wronskian W, and 
Wj (j = !,•••, n) is obtained from IV by replacing the yth column of IV by the column 
[0 0 • • • 0 1] T . Thus, when n = 2, this becomes identical with (2) in Sec. 2.10, 



y i 

>2 


0 

.V2 



yi 

0 

w = 

/ 

/ 

w 1 = 

1 

t 

= “.v 2 . 

1V 2 = 

/ 



Vi 

.v 2 


>2 



.Vx 

1 


The proof of (7) uses an extension of the idea of the proof of (2) in Sec. 2.10 and can 
be found in Ref [All] listed in App. 1. 
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EXAMPLE 


Variation of Parameters. Nonhomogeneous Euler-Cauchy Equation 

Solve the nonhomogeneous Euler-Cauchy equation 

a y* - 3 a 2 / + 6xy' - 6y = x 4 In a- (a > 0). 


Solution. Step 1. General solution of the homogeneous ODE. Substitution of_y = x m and the derivatives 
into the homogeneous ODE and deletion of the factor x m gives 

m(m — I )(m — 2) — 3 m(m — I) -I- 6m — 6 = 0. 

The roots are 1 . 2, 3 and give as a basis 

)’l = -V, )>2 = A 2 , y 3 = A 3 . 

Hence the corresponding general solution of the homogeneous ODE is 


)'h = ClX + C 2 A- 2 + C 3 A 3 . 


Step 2. Determinants needed in (7). These are 




W, = 


w 9 = 


w 3 = 


a 

2x 

2 

.v 2 

2x 

2 

0 

0 

1 

.v 2 

2x 

2 


3a 2 

6a 

x 3 

3a 2 

6a 

x 3 

3a* 2 

6a 

0 

0 

1 


= 2a- 3 


= -2 a 3 


= A 2 . 


Step 3. Integration. In (7) we also need the right side / (a) of our ODE in standard form, obtained by division 
of the given equation by the coefficient a 3 of y M ; thus, r( x) = (a 4 In a)/a 3 = a In a. In (7) we have the simple 
quotients W X IW = a/2, W 2 /W = - 1, W 3 IW = 1/(2y). Hence (7) becomes 


>’p = - v 


I 


x 

— x In a dx 


A 2 Jx In A dx + A 3 J ^ 


a In a dx 


X 

2 



2 


(a In a — a). 


Simplification gives y p = gA 4 (In a — ^). Hence the answer is 


y = >'h + y P = + c 2 x 2 + C 3 x s + ^ a 4 (in a - •£). 

Figure 74 shows y p . Can you explain the shape of this curve? Its behavior near a = 0? The occurrence of 
a minimum? Its rapid increase? Why would the method of undetermined coefficients not have given the 
solution? m 
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EXAMPLE 3 



Fig. 74. Particular solution y p of the nonhomogeneous 
Euler-Cauchy equation in Example 2 


Application: Elastic Beams 

Whereas second-order ODEs have various applications, some of the more important ones 
we have seen, higher order ODEs occur much more rarely in engineering work. An 
important fourth-order ODE governs the bending of elastic beams, such as wooden or iron 
girders in a building or a bridge. 

Vibrations of beams will be considered in Sec. 12.3. 

Bending of an Elastic Beam under a Load 

We consider a beam B of length L and constant (e.g.. rectangular) cross section and homogeneous elastic 
material (e.g., steel); see Fig. 75. We assume that under its own weight the beam is bent so little that it is 
practically straight. If we apply a load to B in a vertical plane through the axis of symmetry (the .v-axis in 
Fig. 75), B is bent. Its axis is curved into the so-called elastic curve C (or deflection curve). It is shown in 
elasticity theory that the bending moment M{x) is proportional to the curvature &(.v) of C. We assume the bending 
to be small, so that the deflection y{x) and its derivative y\x) (determining the tangent direction of C) are small. 
Then, by calculus, k = y"/(l + y' 2 ) 3/2 y". Hence 


Mix) = £//(*>. 


El is the constant of proportionality. E is Young's modulus of elasticity of the material of the beam. I is the 
moment of inertia of the cross section about the (horizontal) z-axis in Fig. 75. 

Elasticity theory shows further that A/"(.v) = /(.v), where f(x) is the load per unit length. Together, 

(8) Elv iv = f(x). 




Fig. 75. Elastic Beam 
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The practically most important supports and corresponding boundary conditions are as follows (see Fig. 76). 

(A) Simply supported y = y" = 0 at x = 0 and L 

(B) Clamped at both ends y = y =0 at a* = 0 and L 

(C) Clamped at a: = 0, free at a = L y(0) = /( 0) = 0, v"(L) = y'"(L) = 0. 

Tlie boundary condition y = 0 means no displacement at that point, y = 0 means a horizontal tangent, y" = 0 
means no bending moment, and y m = 0 means no shear force. 

Let us apply this to the uniformly loaded simply supported beam in Fig. 75. The load is /( a) = /o “ const . 
Then (8) is 


(9) 


v iv = k. 


k= I° 

El ' 


This can be solved simply by calculus. Two integrations give 


y “ 2 A + c l x + c 2- 

y ;, (0) = 0 gives c 2 = 0. Then y"(L) = L{\kL + c x ) = 0, c x = —kLf 2 (since L ^ 0). Hence 


(A 2 - Lx). 


Integrating this twice, we obtain 


y = i (iy v4 -f * > + e ** + c 4) 


with c 4 = 0 from y(0) = 0. Then 

kL / 1? 1? \ 

• V(L) = T (72 _ T + C3 J = °’ C3 = 

Inserting the expression for k, we obtain as our solution 

/0 ',.4 _ „ v 3 , 


12 ' 




2 Lx* -I- r x). 


Since the boundary conditions at both ends are the same, we expect the deflection y(x) to be “symmetric” with 
respect to L/2, that is, >’(a) = y(L - a). Verify this directly or set x = it + L! 2 and show that y becomes an 
even function of w. 


fo 

24 El 




From this we can see that the maximum deflection in the middle at « = 0 (a = LIT) is 5f 0 L 4 l( 16 * 24E/). Recall 
that the positive direction points downward. M 



(A) Simply supported 



(B) Clamped at both 
ends 


x — 0 


(C) Clamped at the left 
end, free at the 
x — L right end 

Fig. 76. Supports of a Beam 
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|~l— 8] GENERAL SOLUTION 

Solve the following ODEs. (Show the details of your work.) 

1. /" - 2y" - Ay' + 8y = «r 3 * + 8.v 2 

2. y + 3 y" — 5y' - 39 y = 30 cos.v 

3. y' v + 0.5V" + 0.0625y = e~ x cos 0.5.v 

4. y'" + 2 y" - 5 y - 6 y = 100<? -3 ' T + 18c-* 

5. x 3 y"' + 0.75aV - 0.75.V = 9.v 5 5 

6. (xD 3 + 4 D 2 )v = Se x 

7. (D 4 + 10D 2 + 9/)y = 13 cosh 2x 

8. (D 3 - 2D 2 - 9D + 1 8/)y = e 2x 


9-14 


INITIAL VALUE PROBLEMS 


Solve the following initial value problems. (Show the 
details.) 


9. y'" - 9y" + 27y' - 27y = 54 sin 3.v, y(0) = 3.5, 
y'(0) = 13.5, y"(0) = 38.5 

10. y iv - I6y = 128 cosh 2x, y(0) = 1, y'(0) = 24, 

y"(0) = 20, y"'(0) = -160 

11. (x 3 D 3 - a 2 D 2 - IxD + 1 6/)y = 9x ln.v, 
y( I ) = 6. Dy( 1 ) = 18, D 2 y(l) = 65 

12. (D 4 - 26 D 2 + 25 J)y = 50(.v + l) 2 , y(0) = 12.16, 

Dy(0) = -6. D 2 y(0) = 34, D 3 y(0) = -130 


13. (D 3 + AD 2 + 85D)y = 135.re x , y(0) = 10.4, 

Dy(0) = -18.1. D 2 y(0) = -691.6 

14. (2D 3 - D 2 - 8D + 4/)y = sin x, y(0) = 1 . 
Dy(0) = 0, D 2 y(0) = 6 

15. WRITING PROJECT. Comparison of Methods. 
Write a report on the method of undetermined coefficients 
and die mediod of variation of parameters, discussing and 
comparing the advantages and disadvantages of each 
method. Illustrate your findings with typical examples. 
Try to show that the method of undetermined coefficients, 
say, for a third-order ODE with constant coefficients and 
an exponential function on the right, can be derived from 
the method of variation of parameters. 

16. CAS EXPERIMENT. Undetermined Coefficients. 
Since variation of parameters is generally complicated, 
it seems worthwhile to try to extend the other method. 
Find out experimentally for what ODEs this is possible 
and for what not. Hint: Work backward, solving ODEs 
with a CAS and then looking whether the solution 
could be obtained by undetermined coefficients. For 
example, consider 

y'" - 12v" + 48y' - 64y = x ,/2 <? 4x and 
x 3 y"' + x 2 y" - 6 xy' +6 y = x In x. 


CHAPTER 3 REVIEW QUESTIONS AND PROBLEMS 


1. What is the superposition or linearity principle? For 
what nth-order ODEs does it hold ? 

2. List some other basic theorems that extend from 
second-order to nth-order ODEs. 

3. If you know a general solution of a homogeneous linear 
ODE, what do you need to obtain from it a general 
solution of a corresponding nonhomogeneous linear 
ODE? 

4. What is an initial value problem for an nth-order linear 
ODE? 

5. What is the Wronskian? What is it used for? 
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GENERAL SOLUTION 


Solve the given ODE. (Show the details of your work.) 

6. y 4- 6y n + 1 8v f -F 40.y = 0 

7. 4a- V" + 12*/' + 3y' = 0 

8. y iv + 10/' + 9 y = 0 

9. 8y"' + 12y" - 2 y* - 3y = 0 
10. (D 3 + 3D 2 -f 3D + !)y = jr 2 


11. (aD 4 4- D 3 )y = 150.V 4 

12. (£> 4 - 2D 3 - 8D 2 )y = 16 cos 2x 

13. (D 3 + l)y = 9e x/2 

14. (x 3 D 3 - 3x 2 D 2 + 6 xD ~ 6/)y = 30 a- 2 

15 . (£> 3 — D 2 — D + l)y = <?* 
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INITIAL VALUE PROBLEMS 


Solve the given problem. (Show the details.) 


16. y"' - 2y" + 4y' - 8y = 0, y(0) = -1, 

y'(0) = 30, y"(0) = 28 

17. x 3 y m 4- lx 2 y n - 2 xy - lOy = 0, y(l) = 1, 

.v'O) = -7, y"(I) = 44 

18. (Z) 3 4- 25D)y = 32 cos 2 4a, y(0) = 0, 

Dy(0) = 0. D 2 y(0) = 0 


19. (D 4 + 40D 2 - 44 1 / )y = 8 cosh a, y(0) = 1.98, 

Dy(0) = 3, D 2 y(0) = -40.02, D 3 y(0) = 27 

20. (a 3 D 3 + 5x 2 D 2 4- 2a D - 21 )y = 7a 3/2 , 

y ( 1 ) = 10.6, Dy( I ) = -3.6, D 2 y( I) = 31.2 






Summary of Chapter 3 
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“SUMMARY OFCHAPTEr3 

Higher Order Linear ODEs 


Compare with the similar Summary of Chap . 2 (the case n = 2). 

Chapter 3 extends Chap. 2 from order n = 2 to arbitrary order n. An nth-order 
linear ODE is an ODE that can be written 

(1) ,v (rt) + Pn- i(-v).v <n_1> + • • • + Pi(x)y' + Po(x)y = r(x) 

with y <n) = d n y/dx n as the first term; we again call this the standard form. Equation 

(1) is called homogeneous if r(x) = 0 on a given open interval 7 considered, 
nonhomogeneous if r(x) ^ 0 on 7. For the homogeneous ODE 

(2) y n) + p n -i(x)/ n ~ u + • • • + Pl (x)y' + po(x)y = 0 

the superposition principle (Sec. 3.1) holds, just as in the case n = 2. A basis or 
fundamental system of solutions of (2) on 7 consists of n linearly independent 
solutions y l9 • • • , y n of (2) on 7. A general solution of (2) on / is a linear combination 
of these, 

(3) y = + • * • 4- c n y n (c v • • * , c n arbitrary constants). 

A general solution of the nonhomogeneous ODE (1) on 7 is of the form 

(4) y = y h + y p (Sec. 3.3). 

Here, y p is a particular solution of (1) and is obtained by two methods 
(undetermined coefficients or variation of parameters) explained in Sec. 3.3. 

An initial value problem for (I) or (2) consists of one of these ODEs and n 
initial conditions (Secs. 3.1, 3.3) 

(5) y( A* 0 ) = K 0i y'(x 0 ) = K x , • • • , y (n “ 1) ( x 0 ) = K n _ x 

with given a* 0 in 7 and given K 0 , • • • , K n _ L . If p 0 , • • • , /? n _ r are continuous on 
7, then general solutions of (I) and (2) on 7 exist, and initial value problems (I), 

(5) or (2), (5) have a unique solution. 




CHAPTER 4 


Systems of ODEs. Phase Plane. 
Qualitative Methods 


Systems of ODEs have various applications (see, for instance. Secs. 4.1 and 4.5). Their 
theory is outlined in Sec. 4.2 and includes that of a single ODE. The practically important 
conversion of a single nth-order ODE to a system is shown in Sec. 4.1. 

Linear systems (Secs. 4.3, 4.4, 4.6) are best treated by the use of vectors and matrices, 
of which, however, only a few elementary facts will be needed here, as given in Sec. 4.0 
and probably familiar to most students. 

Qualitative methods. In addition to actually solving systems (Sec. 4.3, 4.6), which is 
often difficult or even impossible, we shall explain a totally different method, namely, the 
powerful method of investigating the general behavior of whole families of solutions in 
the phase plane (Sec. 4.3). This approach to systems of ODEs is called a qualitative 
method because it does not need actual solutions (in contrats to a " quantitative method ” 
of actually solving a system). 

This phase plane method [ as it is called, also gives information on stability of solutions, 
which is of general importance in control theory, circuit theory, population dynamics, and 
so on. Here, stability of a physical system means that, roughly speaking, a small change 
at some instant causes only small changes in the behavior of the system at all later times. 

Phase plane methods can be extended to nonlinear systems, for which they are 
particularly useful. We will show this in Sec. 4.5, which includes a discussion of the 
pendulum equation and the Lotka-Volterra population model. We finally discuss 
nonhomogeneous linear systems in Sec. 4.6. 

NOTATION. Analogous to Chaps. 1-3, we continue to denote unknown functions by 
y; thus, y^t), y 2 (t). This seems preferable to suddenly using x for functions, jcx(0, x 2 (t), 
as is sometimes done in systems of ODEs. 

Prerequisite: Chap. 2. 

References and Answers to Problems: App. 1 Part A, and App. 2. 


4.0 Basics of Matrices and Vectors 

In discussing linear systems of ODEs we shall use matrices and vectors. This simplifies 
formulas and clarifies ideas. But we shall need only a few elementary facts (by no means 
the bulk of material in Chaps. 7 and 8). These facts will very likely be at the disposal of 
most students. Hence this section is for reference only. Begin with Sec. 4.1 and consult 
4.0 as needed. 
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Most of our linear systems will consist of two ODEs in two unknown functions y^f), 

y'l = 0u) , i + a 12 y 2 , y'l = “5 y x + 2y 2 

(1) for example, 

y*2 = a 2l)’l + a 22) j 2^ ) ; 2 = + 2^2 

(perhaps with additional given functions gi(t), g 2 (t) in the two ODEs on the right). 

Similarly, a linear system of n first-order ODEs in n unknown functions y x (/), 
y n {i) is of the form 

y'l = tfiD'i + ^12^2 + * * * + “myn 

yL = “2^1 + ^22.V2 + • • • + 02 n.Vn 

( 2 ) 


?n = 0nl) ; l + 0n23 ; 2 + ‘ * ’ + 0nrt)'n 

(perhaps with an additional given function in each ODE on the right). 

Some Definitions and Terms 

Matrices. In (1) the (constant or variable) coefficients form a 2 x 2 matrix A, that is, 
an array 


(3) A = [a jfc ] = 


for example. 


a n a \2 

_ a 2l a 22_ 

Similarly, the coefficients in (2) form an n x n matrix 


A = 


-5 

13 




(4) 


A = [aj fc ] = 


“011 

012 

01n 

021 

022 

02 n 

_0nl 

0« 2 

a nn, 


The an, a 12 , • • • are called entries, the horizontal lines rows, and the vertical lines 
columns. Thus, in (3) the first row is [a n a 12 ], the second row is [a 21 a 2 2 ], and the 
first and second columns are 


022 J 

In the “double subscript notation” for entries, the first subscript denotes the row and the 
second the column in which the entry stands. Similarly in (4). The main diagonal is the 
diagonal a n a 22 • • • a nn in (4), hence a n a 22 in (3). 

We shall need only square matrices, that is, matrices with the same number of rows 
and columns, as in (3) and (4). 


0n 
.02 1 
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Vectors. A column vector x with n components x x , * * * , x n is of the form 


-*r 

*^2 


thus if n = 2, 


x = 




%2, 


Similarly, a row vector v is of the form 


v = [v 1 • ■ • v n ], thus if n = 2, then v = [u x , u 2 ]- 


Calculations with Matrices and Vectors 

Equality. Two n X n matrices are equal if and only if corresponding entries are equal. 
Thus for n = 2, let 



**11 

**12 



'bn 

^12 

A = 

_**21 

**22 _ 

and 

B = 

_^21 

&22_ 


Then A = B if and only if 

**u = ^ 11 » **12 — ^12 

**21 = ^ 21 * **22 = ^ 22 - 

Two column vectors (or two row vectors) are equal if and only if they both have n 
components and corresponding components are equal. Thus, let 




A'l 


”1 = *1 


V = 


and x = 


Then v = x if and only if 


V 2 


*2. 


U 2 = A' 2 . 


Addition is performed by adding corresponding entries (or components); here, matrices 
must both be n X /z, and vectors must both have the same number of components. Thus 
for n = 2, 


(5) 


“<in + b n 

**12 “h ^12 

"»i + *l" 


» 

V + X = 

_«21 + b 21 

**22 “1“ ^22_ 

v 2 + x 2 


Scalar multiplication (multiplication by a number c) is performed by multiplying each 
entry (or component) by c. For example, if 



' 9 

3' 



“-63 

-21“ 

A = 

-2 

> 

0 _ 

then 

— 7A = 

14 

0 _ 


“0.4“ 



“ 4 ' 

v = 

_— 13_ 

, then 

10v = 

_— 130_ 
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Matrix Multiplication. The product C = AB (in this order) of two n X n matrices 
A = [cij k ] and B = [b jk ] is the n X n matrix C = [c jk ] with entries 


( 6 ) 


n 

c jk 2 a jmPmk 

m=l 


j = 1> ’ • • . n 

k = 1 , • • • , n. 


that is, multiply each entry in the Jth row of A by the corresponding entry in the kth column 
of B and then add these n products. One says briefly that this is a “multiplication of rows 
into columns.” For example. 


9 

3 ~ 

_ 1 — 4 “ 


9- 1 + 3-2 

9 • (- 4 ) + 3 - 5 " 

2 

0 . 

_2 5 _ 


2 • 1 + 0-2 

(- 2 )* (- 4 ) + 0 * 5 . 


15 —21 
2 8 . ’ 


CAUTION! Matrix multiplication is not commutative , AB =£ BA in general. In our 
example. 


_ 1 -4 

9 

3 " 

" 1-9 + (- 4 )* (- 2 ) 

1 • 3 + (- 4 ) • 0 " 

_2 5 _ 

.-2 

0. 

_2 ■ 9 + 5 • (- 2 ) 

2 • 3 + 5 • 0 . 



Multiplication of an n X n matrix A by a vector x with n components is defined by the 
same rule: v = Ax is the vector with the n components 

n 

Vj — ^2 ^jm^rn j If " ' " » W. 

m=l 

For example. 


‘ 12 

7" 


v 


12 ^ + 1x2 

—8 

3 _ 


_* 2 _ 


8 ^ + 3 a : 2 _ 


Systems of ODEs as Vector Equations 

Differentiation. The derivative of a matrix (or vector) with variable entries (or 
components) is obtained by differentiating each entry (or component). Thus, if 


y(0 = 


~yi(0~ 


~e~ 2t ~ 

.yz(0_ 


_sin t_ 


then 


~y i(0" 


1 

8 

1 

CN 

1 

_>&)_ 


cost J 


Using matrix multiplication and differentiation, we can now write (1) as 


“ /“ 
yi 

= Ay = 

’flu 

a 12 


V 


'-5 

2" 


ft" 






. e.g., y' = 




LyaJ 


.<*21 

a 22_ 




. 13 

1 

2 J 


_V2_ 


(7) y' = 
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Similarly for (2) by means of an n X n matrix A and a column vector y with n components, 
namely, y' = Ay. The vector equation (7) is equivalent to two equations for the 
components, and these are precisely the two ODEs in (1). 


Some Further Operations and Terms 

Transposition is the operation of writing columns as rows and conversely and is indicated 
by T. Thus the transpose A T of the 2 X 2 matrix 



’<*11 

<*12’ 


'-5 

2“ 


’<*11 

<*21 


'-5 

13” 

A = 

_<*21 

"22_ 


_ 13 

1 

2 J 

is A T = 

_<*12 

<*22 _ 


2 

I 

2j 


The transpose of a column vector, say, 


"l 


is a row vector. 


v' = pi v 2 ], 


and conversely. 

Inverse of a Matrix. The n X n unit matrix I is the n X n matrix with main diagonal 
1, 1, • • • , 1 and all other entries zero. If for a given n X n matrix A there is an n X n 
matrix B such that AB = BA = I, then A is called nonsingular and B is called the inverse 
of A and is denoted by A" 1 ; thus 

(8) AA" 1 = A _1 A = I. 


If A has no inverse, it is called singular. For n = 2, 


(9) 


A" 1 


1 <*22 a l2 


det A 


L "21 


"n 


where the determinant of A is 


( 10 ) 


det A = 


"ii 

<*21 


"l2 


Cl 22 


” <* 11<*22 <* 12 <* 21 - 


(For general n, see Sec. 7.7, but this will not be needed in this chapter.) 

Linear Independence, r given vectors v a> , • • • , v <r) with n components are called a 
linearly independent set or, more briefly, linearly independent, if 

(11) c 1 v (1) + • • • + c r v (r) = 0 


implies that all scalars c x , • • • , c r must be zero; here, 0 denotes the zero vector, whose 
n components are all zero. If (1 1) also holds for scalars not all zero (so that at least 
one of these scalars is not zero), then these vectors are called a linearly dependent set 
or, briefly, linearly dependent, because then at least one of them can be expressed as 
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a linear combination of the others; that is, if, for instance, c x # 0 in (11), then we 
can obtain 

v (1> = — — (c 2 v (2) 4 • • • 4- c r v Cr) )- 

<*i 


Eigenvalues, Eigenvectors 

Eigenvalues and eigenvectors will be very important in this chapter (and, as a matter of 
fact, throughout mathematics). 

Let A = [cij k ] be an n X n matrix. Consider the equation 

(12) Ax = Ax 

where A is a scalar (a real or complex number) to be determined and x is a vector to be 
determined. Now for every A a solution is x = 0. A scalar A such that (12) holds for some 
vector x ^ 0 is called an eigenvalue of A, and this vector is called an eigenvector of A 
corresponding to this eigenvalue A. 

We can write (12) as Ax — Ax = 0 or 

(13) (A — AI)x = 0. 


These are n linear algebraic equations in the n unknowns x v • • • , (the components of 
x). For these equations to have a solution x + 0, the determinant of the coefficient matrix 
A — AI must be zero. This is proved as a basic fact in linear algebra (Theorem 4 in 
Sec. 7.7). In this chapter we need this only for n = 2. Then (13) is 


(14) 

in components, 
(14*) 


'a u - A 

012 


~Xl~ 


"0" 

021 

022 — A_ 


_*2_ 


_ 0 _ 


(0n - A)*! 4 a 12 x 2 = 0 

021*1 + (022 “ A)x 2 = 0. 


Now A - AI is singular if and only if its determinant det (A — AI), called the characteristic 
determinant of A (also for general /?), is zero. This gives 


det (A — AI) = 


0n “ A 


021 


012 

022 _ A 


(15) 


— (a xx A )(# 22 A) 012021 


— A 2 (%! 4 fl 22 )A 4- 0H022 ~ 012021 ~~ 0* 


This quadratic equation in A is called the characteristic equation of A. Its solutions are 
the eigenvalues A x and A 2 of A. First determine these. Then use (14*) with A = A x to 
determine an eigenvector x a) of A corresponding to A x . Finally use (14*) with A = A 2 to 
find an eigenvector x <2> of A corresponding to A 2 . Note that if x is an eigenvector of A, 
so is kx for any k =£ 0. 
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EXAMPLE 1 Eigenvalue Problem 

Find Ihe eigenvalues and eigenvectors of the matrix 



Solution. The characteristic equation is the quadratic equation 


det [A - AI] = 


-4 - A 
- 1.6 



= A 2 4* 2.8A + 1.6 = 0. 


It has the solutions A x = —2 and A 2 = —0.8. These are the eigenvalues of A. 
Eigenvectors are obtained from (14*). For A = \i = -2 we have from (14*) 


(-4.0 + 2.0) Ax + 4.0.r 2 = 0 

-I.6a*j + (1.2 + 2.0)a- 2 = 0. 


A solution of the First equation is A| = 2, x 2 = 1 . This also satisfies the second equation. (Why?). Hence an 
eigenvector of A corresponding to A L = -2.0 is 


(17) 



Similarly, 



is an eigenvector of A corresponding to A 2 = -0.8, as obtained from (14*) with A — A 2 - Verify this. ■ 


4.1 Systems of ODEs as Models 


We first illustrate with a few typical examples that systems of ODEs can serve as models 
in various applications. We further show that a higher order ODE (with the highest 
derivative standing alone on one side) can be reduced to a first-order system. Both facts 
account for the practical importance of these systems. 


EXAMPLE 1 Mixing Problem Involving Two Tanks 

A mixing problem involving a single tank is modeled by a single ODE, and you may first review the 
corresponding Example 3 in Sec. 1.3 because the principle of modeling will be the same for two tanks. The 
model will be a system of two first-order ODEs. 

Tank 7\ and T 2 in Fig. 77 contain initially 100 gal of water each. In T\ the water is pure, whereas 150 lb of 
fertilizer are dissolved in T 2 . By circulating liquid at a rate of 2 gal/min and stirring (to keep the mixture uniform) 
the amounts of fertilizer y^/) in 7^ and y 2 (t) in T 2 change with time t. How long should we let the liquid circulate 
so that 7*! will contain at least half as much fertilizer as there will be left in 7 2 ? 

Solution . Step 1 . Setting up the model. As for a single tank, the lime rate of change y[(t) of yi(i) equals 
inflow minus outflow. Similarly for tank 7" 2 . From Fig. 77 we see that 


y[ = Inflow/min - Outflow/min = — r— y 2 — 

1 00 ’ 

, 2 
y 2 = Inflow/min - Outflow/min = —tv i - 

1UU 



100 


y r 2 


(Tank T x ) 


(Tank T 2 ). 


Hence the mathematical model of our mixture problem is the system of first-order ODEs 


y{ = —0.02 y t + 0.02v 2 
y 2 = 0.02yi - 0.02 y 2 


(Tank T x ) 
(Tank T 2 ). 
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As a vector equation with column vector y 



and matrix A this becomes 


y' = Ay, 


where 



0 . 02 " 

— 0 . 02 _ * 


Step 2. General solution. As for a single equation, we try an exponential function of t> 

(1) y = xe Ar . Then y' = Axc a ' = Axe A/ . 

Dividing the last equation Axe A/ = Axc a ' by e A/ and interchanging the left and right sides, we obtain 

Ax = Ax. 

We need nontrivial solutions (solutions that are not identically zero). Hence we have to look for eigenvalues 
and eigenvectors of A. The eigenvalues are the solutions of the characteristic equation 


( 2 ) 


del (A - AI) = 


-0.02 - A 

0.02 


0.02 

-0.02 - A 


= (-0.02 - A) 2 - 0.02 2 = A(A + 0.04) = 0. 


We see that = 0 (which can very well happen — don’t get mixed up — it is eigenvectors that must not be zero) 
and A 2 = -0.04. Eigenvectors are obtained from (14*) in Sec. 4.0 with A = 0 and A = -0.04. For our present 
A this gives [we need only the first equation in (14*)] 

-0.02a-! + 0.02a- 2 = 0 and (-0.02 + 0.04)*! 4- 0.02x 2 = 0, 

respectively. Hence x x = x 2 and x x = — a 2 , respectively, and we can take x x = a 2 = 1 and x x = -x 2 = l. 
This gives two eigenvectors corresponding to \ x = 0 and A 2 = -0.04, respectively, namely, 

and x <2 > = [_'J . 



From (1) and the superposition principle (which continues to hold for systems of homogeneous linear ODEs) 
we thus obtain a solution 

(3) y = + c 2 x ( V 2 ‘ = c, ^ 

where c x and c 2 are arbitrary constants. Later we shall call this a general solution. 

Step 3 . Use of initial conditions. The initial conditions are y 2 (0) = 0 (no fertilizer in tank T x ) and y 2 (0) = 150. 
From this and (3) with / = 0 we obtain 

y(0) = cj [ 


1 c x + c 2 0 

C 2 = = 

— 1_ ci - Co _150_ 
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EXAMPLE 2 


In components this is + c 2 - 0, c x - c 2 - 150. The solution is c x 

‘T 


y = 75x (1) - 75x (2) <T 0 04t = 75 


- 75 


= 75. c 2 — -75. This gives the answer 


In components. 


yi = 75 - 75e~ 0 04t (Tank 7\, lower curve) 

y 2 = 75 + 75<? -0 04t (Tank T 2 . upper curve). 

Figure 77 shows the exponential increase of y\ and the exponential decrease of y 2 to the common limit 75 lb. 
Did you expect this for physical reasons? Can you physically explain why the curves look “symmetric”? Would 
the limit change if T x initially contained 100 lb of fertilizer and T 2 contained 50 lb? 

Step 4. Answer. T x contains half the fertilizer amount of T 2 if it contains 1/3 of the total amount, that is, 
50 lb. Thus 


y x = 15- 15e-° Mt = 50, e~ 004t = t = (In 3)/0.04 = 27.5. 

Hence the fluid should circulate for at least about half an hour. H 

Electrical Network 

Find the currents / x (t) and l 2 {t) in the network in Fig. 78. Assume all currents and charges to be zero at f = 0, 
the instant when the switch is closed. 


L = 1 henry C = 0.25 farad 



Fig. 78. Electrical network In Example 2 


Solution . Step 1. Setting up the mathematical model. The model of this network is obtained from 
Kirchhoff $ voltage law, as in Sec. 2.9 (where we considered single circuits). Let I x {t) and l 2 (t) be the currents 
in the left and right loops, respectively. In the left loop the voltage drops are Ll[ = /J [V] over the inductor 
and R\(I\ — I 2 ) = 4 (I x - I 2 ) [V] over the resistor, the difference because ! x and / 2 flow through the resistor 
in opposite directions. By Kirchhoff s voltage law the sum of these drops equals the voltage of the battery; that 
is, /{ + 4 {l 1 — l 2 ) = 12, hence 

(4a) /J = -41 x + 4 1 2 + 12. 

In the right loop the voltage drops are R 2 I 2 = 6 / 2 IV] and R\(f 2 — l x ) = 4(/ 2 - I x ) IV] over the resistors and 
( 1/C)/ 1 2 dt = 4 / 1 2 dt [V] over the capacitor, and their sum is zero, 

6/ 2 + 4(/ 2 - 7j) + 4 J / 2 dt = 0 or IO/ 2 -4^+4 j l 2 dr = 0. 

Division by 10 and differentiation gives / 2 - 0.4/J + 0.4/ 2 = 0. 

To simplify the solution process, we first get rid of 0.4/j, which by (4a) equals 0.4(-4/ 1 + 4/ 2 + 12). 
Substitution into the present ODE gives 


/ 2 = 0.4/{ - 0.4 / 2 = 0.4(-4/ 1 + 4/ 2 + 12) - 0.4/ 2 
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and by simplification 

(4b) 1 2 = -I.6J1 + 1.2/ 2 + 4.8. 

In matrix form, (4) is (we write J since I is the unit matrix) 

f pii r-4.0 

(5) J = AJ + g, where J = , A = 

Ia J L-i* 

Step 2. Solving (5). Because of the vector g this is a nonhomogeneous system, and we try to proceed as For 
a single ODE, solving first the homogeneous system j' = AJ (thus j' - AJ = 0) by substituting J = xe At . 
This gives 

J ; = Axc A/ = Axc a ', hence Ax = Ax. 



Hence to obtain a nontrivial solution, we again need the eigenvalues and eigenvectors. For the present matrix 
A they are derived in Example I in Sec. 4.0: 

*■— ■ * a ’-“ ’■"■[.'J- 

Hence a “general solution'’ of the homogeneous system is 

Jfc = ClX a) f -2f + c 2 x <2> c- a8( . 

For a particular solution of the nonhomogeneous system (5). since g is constant, we try a constant column vector 
J p = a with components <i,. a 2 . Then Jp = 0, and substitution into (5) gives Aa + g = 0; in components. 

— 4.0tfx + 4.0 a 2 + 12,0 = 0 
— 1.6a, + 1 2a 2 + 4.8 = 0. 

The solution is a x = 3. a 2 = 0: thus a = . Hence 

(6) J = J, t + Jp = c x x ay e- 2t + c 2 x <2) c-° m + a; 
in components, 

I\ — 2 c x e~ 2tr 4- c 2 e °' 8t 4- 3 
l 2 = c x e~ 2t + 0.8c 2 <T°- 8t . 

The initial conditions give 

/i(0) = 2c, + c 2 + 3=0 

* 2 (0) = ti + 0.8c 2 = 0. 

Hence c, = -4 and c 2 = 5. As the solution of our problem we thus obtain 

(7) J = — 4x (1) <T 2f + 5x <2) <T°- 8t -1* a. 

In components (Fig. 79b). 

/, = — 8e“ 2t + 5e-° st + 3 
/ 2 = — 4c“ 2t + 4c“ a8e . 


Now comes an important idea, on which we shall elaborate further, beginning in Sec. 4.3. Figure 79a shows 
/;,(/) and / 2 (/> as two separate curves. Figure 79b shows these two currents as a single curve [^(f), l 2 (0] in the 
/,/ 2 -plane. This is a parametric representation with time t as the parameter. It is often important to know in 
which sense such a curve is traced. This can be indicated by an arrow in the sense of increasing f, as is shown. 
The /,y 2 -plane is called the phase plane of our system (5). and the curve in Fig. 79b is called a trsyectory. We 
shall see that such “phase plane representations” are far more important than graphs as in Fig. 79a because 
they will give a much better qualitative overall impression of the general behavior of whole families of solutions, 
not merely of one solution as in the present case. ■ 
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(b) Trajectory [7,(0. / 2 (f)f 
in the /^-plane 
(the "phase plane") 


Fig. 79. Currents in Example 2 


Conversion of an nth-Order ODE to a System 

We show that an rcth-order ODE of the general form (8) (see Theorem 1) can be converted 
to a system of n first-order ODEs. This is practically and theoretically important — 
practically because it permits the study and solution of single ODEs by methods for 
systems, and theoretically because it opens a way of including the theory of higher order 
ODEs into that of first-order systems. This conversion is another reason for the importance 
of systems, in addition to their use as models in various basic applications. The idea of 
the conversion is simple and straightforward, as follows. 


THEOREM 1 


Conversion of an ODE 

An nth-order ODE 

(8) y™ = F{t,y,y', 

can be converted to a system of n first-order ODEs by setting 

(9) j’i = y, y 2 = y', )’3 = y", * • • . y n = / n-1> - 


This system is of the form 


( 10 ) 


y[ = j >2 
y'z = Js 


yn — 1 yn 


yh = Hu ;vi, y n l 


PROOF The first 72 — 1 of these n ODEs follow immediately from (9) by differentiation. Also, 
yn = y Cn) by (9), so that the last equation in (10) results from the given ODE (8). ■ 
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EXAMPLE 3 Mass on a Spring 

To gain confidence in the conversion method, let us apply it to an old friend of ours, modeling the free motions 
of a mass on a spring (see Sec. 2.4) 

r/ , ; , y // C f k 

my + cy + ky = 0 or y = y — — y. 

For this ODE (8) the system (10) is linear and homogeneous, 


y l =^2 


y 2 = )’i }' 2 - 

m m 


Setting y = 


- P 1 " 

J2A 


we get in matrix form 

y' = Ay = 


k_ 

m 


The characteristic equation is 

det (A - AI) = 



■>r 


^2. 


-A 1 

- — — — - A 

in m 


= A 2 + — A + — = 0. 


It agrees with that in Sec. 2.4. For an illustrative computation, let m = 1, c = 2, and k = 0.75. Then 
A 2 + 2 A + 0.75 = (A + 0.5)(A + 1.5) = 0. 


This gives the eigenvalues A x = -0.5 and A 2 = -1.5, Eigenvectors follow from the first equation in 
A — AI = 0, which is -Xx 1 + .v 2 = 0. For ^ this gives 0.5^ + x 2 = 0, say, x\ = 2, x 2 = — 1. For A 2 = — 1.5 
it gives 1.5 a'x + x 2 = 0, say. x 1 = 1, x 2 = —1.5. These eigenvectors 

This vector solution has the first component 

_ ““ 0.5t i — 1.5t 

y — y\ - + c 2 e 

which is the expected solution. The second component is its derivative 

)2 - yi - y - c i e - i.5 c 2 e m 


siffi 




MIXING PROBLEMS 

1. Find out without calculation whether doubling the flow 
rate in Example 1 has the same effect as halfing the 
tank sizes. (Give a reason.) 

2. What happens in Example 1 if we replace T 2 by a tank 
containing 500 gal of water and 150 lb of fertilizer 
dissolved in it? 


3. Derive the eigenvectors in Example 1 without 
consulting this book. 

4. In Example 1 find a “general solution” for any ratio 
a = (flow rate) /(tank size), tank sizes being equal. 
Comment on the result. 

5. If you extend Example 1 by a tank T z of the same size 
as the others and connected to T 2 by two tubes with 
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flow rates as between 7\ and T* what system of ODEs 
will you get? 

6. Find a “general solution” of the system in Prob. 5. 


7-10 


ELECTRICAL NETWORKS 


7. Find the currents in Example 2 if the initial currents 
are 0 and —3 A (minus meaning that / 2 (0) flows against 
the direction of the arrow). 


8. Find the currents in Example 2 if the resistance of R x 
and R 2 is doubled (general solution only). First, guess. 

9. What are the limits of the currents in Example 2? 
Explain them in terms of physics. 

10. Find the currents in Example 2 if the capacitance is 
changed to C — 1/5.4 F (farad). 


1 J— 15 


CONVERSION TO SYSTEMS 


Find a general solution of the given ODE (a) by first 
converting it to a system, (b), as given. (Show the details 
of your work.) 

11. y" - 4 y = 0 12. y" + 2y' - 24v = 0 

13. y" - y' = 0 14. y" + 15y' + 50y = 0 

15. 64y" - 48y' - 7y = 0 


16. TEAM PROJECT. Two Masses on Springs, (a) Set 
up the model for the (undamped) system in Fig. 80. 

(b) Solve the system of ODEs obtained. Hint. Try 
y = \e wl and set <o 2 = A. Proceed as in Example 1 or 2. 

(c) Describe the influence of initial conditions on the 
possible kind of motions. 



System in 
static 

equilibrium 



System in 
motion 


Fig. 80. Mechanical system in Team Project 16 


4.1 Basic Theory of Systems of ODEs 

In this section we discuss some basic concepts and facts about systems of ODEs that are 
quite similar to those for single ODEs. 

The first-order systems in the last section were special cases of the more general system 

y'i = fi(t> vi- • • • , ,v„) 
y'z = h(U y lt • • • , yj 

( 1 ) 

>'« = fn(t, )'i, • • • , >'„)• 

We can write the system (1) as a vector equation by introducing the column vectors 
y = Di * ' * y n ] T an d f = [/ 1 * * • f n ] T (where T means transposition and saves us 

the space that would be needed for writing y and f as columns). This gives 

(i) y' = f(/. y). 

This system (1) includes almost all cases of practical interest. For n = 1 it becomes 
y[ = /i(/, Vi) or, simply, y = f(t, y) 9 well known to us from Chap. 1 . 

A solution of (1) on some interval a < t < b is a set of n differentiable functions 


V] = *i to. 


}'n hn(0 
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THEOREM 1 


on a < t < b that satisfy (1) throughout this interval. In vector form, introducing the 
“solution vector ” h = [/i x • • • h n ] T (a column vector!) we can write 

y = h(0- 

An initial value problem for (1) consists of (1) and n given initial conditions 
(2) Vi(r 0 ) = Ki, } r 2^o) = ^2> * * * » yn(t o) = 

in vector form, y(t 0 ) = K, where t 0 is a specified value of t in the interval considered and 
the components of K = [K x • • • K n ] J are given numbers. Sufficient conditions for the 
existence and uniqueness of a solution of an initial value problem (1), (2) are stated in 
the following theorem, which extends the theorems in Sec. 1 .7 for a single equation. (For 
a proof, see Ref. [A7].) 


Existence and Uniqueness Theorem 

Let /i, • ■ • , f n in ( 1) be continuous functions having continuous partial derivatives 
d/i/dvi, • • • , dfifdy n , • • • , df n /dy n in some domain R of ty\y 2 * * * y n -space 
containing the point (/ 0 , K ln • • • , K n ). Then (1) has a solution on some interval 
r 0 - a < t < t Q + a satisfying (2), and this solution is unique. 


Linear Systems 

Extending the notion of a linear ODE, we call (1) a linear system if it is linear in 
yi • • • , that is, if it can be written 


y[ — a u( ! ))’i + - - • + fl in(0.V)i + gi(0 

(3) j 

y'n = «nl(0.Vl + • • • + «nn(/).V« + 8n(0- 


In vector form, this becomes 

(3) y' = Ay + g 


where A = 

’<*11 

^1 n 

. y = 

■yi’ 

. g = 

"^r 


_fl?ii 



.)n. 




This system is called homogeneous if g = 0, so that it is 
(4) y' = Ay. 


If g ^ 0, then (3) is called nonhomogeneous. The system in Example I in the last section is 
homogeneous and in Example 2 nonhomogeneous. The system in Example 3 is homogeneous. 
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THEOREM 2 


THEOREM 3 


PROOF 


For a lineal' system (3) we have df 1 /dy 1 = r/ n (/), • • • , df n /dy n = a nn (t) in Theorem 1. 
Hence for a linear system we simply obtain the following. 


Existence and Uniqueness in the Linear Case 

Let the Oj k ’s and gfs in (3) be continuous functions of t on an open interval 
a < t < ft containing the point t = t 0 . Then (3) has a solution y (t) on this internal 
satisfying (2), and this solution is unique . 


As for a single homogeneous linear ODE we have 


Superposition Principle or Linearity Principle 

If y (1) and y (2) are solutions of the homogeneous linear system (4) on some interval 
so is any linear combination y = + c 2 y (2 \ 


Differentiating and using (4), we obtain 

y' = tay" + «T 

= Ci y a) ' + cay' 21 ' 

= c x Ay U) + c 2 Ay <2) 

= ACcjy' 11 + c 2 y <2) ) = Ay. ■ 

The general theory of linear systems of ODEs is quite similar to that of a single linear 
ODE in Secs. 2.6 and 2.7. To see this, we explain the most basic concepts and facts. For 
proofs we refer to more advanced texts, such as [A7]. 

Basis. General Solution. Wronskian 

By a basis or a fundamental system of solutions of the homogeneous system (4) on some 
interval J we mean a linearly independent set of n solutions y (1) , • • • , y (n) of (4) on that 
interval. (We write J because we need I to denote the unit matrix.) We call a corresponding 
linear combination 

(5) y = Ciy a) • • • + c„y' n) (c ls • • • , c n arbitrary) 

a general solution of (4) on J. It can be shown that if the a^ft) in (4) are continuous on 
7, then (4) has a basis of solutions on 7, hence a general solution, which includes every 
solution of (4) on J. 

We can write n solutions y (1) , • • • , y (n) of (4) on some interval J as columns of an 
n X n matrix 

(6) Y = [y a) • • • y (n) ] . 
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The determinant of Y is called the Wronskian of y (1) , ••• , y (n) , written 





y? } 


W( y a> , • 

• , y (n> ) = 


y? ■ 

yW> 



y? 

.Vn 2> 

* • Vn n> 


The columns are these solutions, each in terms of components. These solutions form a 
basis on J if and only if W is not zero at any t x in this interval. W either is identically zero 
or is nowhere zero in J. (This is similar to Secs. 2.6 and 3.1.) 

If the solutions y (1) , • • • , y (n) in (5) form a basis (a fundamental system), then (6) is 
often called a fundamental matrix. Introducing a column vector c = [c x c 2 • • • c n ] T , 
we can now write (5) simply as 

(8) y = Yc. 

Furthermore, we can relate (7) to Sec. 2.6, as follows. If y and z are solutions of a 
second-order homogeneous linear ODE, their Wronskian is 

>’ Z 

W(y, z) = • 

y z 

To write this ODE as a system, we have to set y = y lt y’ = y[ = y 2 and similarly for z 
(see Sec. 4.1). But then W(y, z ) becomes (7), except for notation. 

4.: Constant-Coefficient Systems. 

Phase Plane Method 

Continuing, we now assume that our homogeneous linear system 

(1) y' = Ay 

under discussion has constant coefficients, so that the n X n matrix A = [a jfe ] has entries 
not depending on t. We want to solve (1). Now a single ODE y' = ky has the solution 
y = Ce kt . So let us try 

(2) y = xe*. 

Substitution into (1) gives y' = Axe*' = Ay = Axe*'. Dividing by e*\ we obtain the 

eigenvalue problem 


( 3 ) 


Ax = Ax. 
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Thus the nontrivial solutions of (1) (solutions that are not zero vectors) are of the form (2), 
where A is an eigenvalue of A and x is a corresponding eigenvector. 

We assume that A has a linearly independent set of n eigenvectors. This holds in most 
applications, in particular if A is symmetric ( a ^ = aj k ) or skew-symmetric ( a ^ = —aj k ) 
or has n different eigenvalues. 

Let those eigenvectors be x (1) , • • , x (n) and let them correspond to eigenvalues 
Ai, • • • , A n (which may be all different, or some — or even all — may be equal). Then the 
corresponding solutions (2) are 

(4) y (1) = x ( V‘\ • • • , y <TO) = x ( V nt . 


Their Wronskian W = W(y a) , • • • , y (n) ) [(7) in Sec. 4.2] is given by 




. . . 


. . . 
x x 

v (n) 

Ai 

w = (y a) , • • • , y (n) ) = 

4 a v' i£ 

• • • x£V" e 

A. t + • • 4*A„t 

= e 

yd) . . . 

x 2 

v (n) 

X 2 


x“>e Alt 

... xS a e V 


*-<!> . . . 

v (n) 


On the right, the exponential function is never zero, and the determinant is not zero either 
because its columns are the n linearly independent eigenvectors. This proves the following 
theorem, whose assumption is true if the matrix A is symmetric or skew-symmetric, or if 
the n eigenvalues of A are all different. 


THEOREM 1 


General Solution 

If the constant matrix A in the system ( 1 ) has a linearly independent set of n 
eigenvectors, then the corresponding solutions y (1) , • • • , y (n) in (4) form a basis of 
solutions of { 1), and the corresponding general solution is 

(5) y = c 1 x (1) ^ Alt + • • • + c n x (n) e A ”\ 


How to Graph Solutions in the Phase Plane 

We shall now concentrate on systems (1) with constant coefficients consisting of two 
ODEs 

y[ = «n;yi + a^yz 

(6) y = Ay; in components, 

y'z = “ziyi + «22)'2- 


Of course, we can graph solutions of (6), 


(7) 


r*»i 

y (/) = 
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EXAMPLE 1 


as two curves over the r-axis, one for each component of y(/). (Figure 79a in Sec. 4.1 shows 
an example.) But we can also graph (7) as a single curve in the.y^-plane. This is a parametric 
representation (parametric equation) with parameter t, (See Fig. 79b for an example. Many 
more follow. Parametric equations also occur in calculus.) Such a curve is called a trajectory 
(or sometimes an orbit or path) of (6). The y L v 2 -plane is called the phase plane. 1 If we fill 
the phase plane with trajectories of (6), we obtain the so-called phase portrait of (6). 


Trajectories in the Phase Plane (Phase Portrait) 

In order to see what is going on, let us find and graph solutions of the system 


(8) 



thus 


.'•! = -3vi + ,V2 
A = >'i - 3y 2 . 


Solution. By substituting y = xe M and y' = \xe AI and dropping the exponential function we get Ax = Ax. 
The characteristic equation is 


del (A - AI) = 


-3 - A 1 
I -3 - A 


= A 2 + 6A + 8 = 0. 


This gives the eigenvalues A* = -2 and A 2 = “4. Eigenvectors are then obtained from 


(-3 - A).V]_ + . 1*2 = 0. 

For At = -2 this is —x\ + .v 2 = 0. Hence we can take x (1) = ll I l T . For A 2 = -4 this becomes x% + a * 2 = 0, 
and an eigenvector is x (2> = [1 - ll T . This gives the general solution 


v - r vi i - 
- V 2_ 


Cl y <n + c 2 y (Z> = c x 


<- 2 ' + c 2 


■L- 

.- 1 . 


Figure 81 on p. 142 shows a phase portrait of some of the trajectories (to which more trajectories could be added 
if so desired). The two straight trajectories correspond to C\ = 0 and c 2 = 0 and the others to other choices of 
c l* c 2- ■ 


Studies of solutions in the phase plane have recently become quite important, along with 
advances in computer graphics, because a phase portrait gives a good general qualitative 
impression of the entire family of solutions. This method becomes particularly valuable 
in the frequent cases when solving an ODE or a system is inconvenient or impossible. 


Critical Points of the System (6) 

The point y = 0 in Fig. 81 seems to be a common point of all trajectories, and we want 
to explore the reason for this remarkable observation. The answer will follow by calculus. 
Indeed, from (6) we obtain 


dy 2 _ y 2 dt _ _ ^2x^1 4 * 022.V2 

dyi y[ dt y[ fln.Vi + **12X2 


l A name tliai comes from physics, where it is the y-(/w)-plane. used to plot a motion in terms of position 
v and velocity y' = v (m = mass): but the name is now used quite generally for the yi3 , 2‘P* ane - 

The use of the phase plane is a qualitative method, a method of obtaining general qualitative information 
on solutions without actually solving an ODE or a system. This method was created by HENRI POINCARE 
(1854-1 912), a great French mathematician, whose work was also fundamental in complex analysis, divergent 
series, topology, and astronomy. 



142 


CHAP. 4 Systems of ODEs. Phase Plane. Qualitative Methods 


EXAMPLE 1 


EXAMPLE 2 


This associates with every point P: (y lf y 2 ) a unique tangent direction dy 2 /dy l of the 
trajectory passing through P , except for the point P = P 0 : (0, 0). where the right side of 
(9) becomes 0/0. This point P 0 , at which dy 2 /dy x becomes undetermined, is called a critical 
point of (6). 

Five Types of Critical Points 

There are five types of critical points depending on the geometric shape of the trajectories 
near them. They are called improper nodes, proper nodes, saddle points, centers, and 
spiral points. We define and illustrate them in Examples 1-5. 

(Continued) Improper Node (Fig. 81) 

An improper node is a critical point P 0 at which all the trajectories, except for two of them, have the same 
limiting direction of the tangent. The two exceptional trajectories also have a limiting direction of the tangent 
at P 0 which, however, is different. 

The system (8) has an improper node at 0, as its phase portrait Fig, 81 shows. The common limiting direction 
at 0 is that of the eigenvector x (1) = [ 1 1] T because e~ 4t goes to zero faster than eT 21 as r increases. The two 

exceptional limiting tangent directions are those of x (2) = [1 -1] T and — x <2) = [— 1 ll T . ■ 

Proper Node (Fig. 82) 

A proper node is a critical point P 0 at which every trajectory has a definite limiting direction and for any given 
direction d at P 0 there is a trajectory having d as its limiting direction. 

The system 


(10) y' = 

Lo i 

has a proper node at the origin (see Fig. 82). Indeed, the matrix is the unit matrix. Its characteristic equation 
(1 - A) 2 = 0 has the root A = 1. Any x ^ 0 is an eigenvector, and we can take [1 0] T and [0 1] T . Hence 
a general solution is 


y, 


thus 


>'i = .Vi 

yz = yz 



Fig. 81. Trajectories of the system (8) 
(Improper node) 


Fig, 82. Trajectories of the system (10) 
(Proper node) 



SEC. 4.3 Constant-Coefficient Systems. Phase Plane Method 


143 


EXAMPLE 3 


EXAMPLE 4 


Saddle Point (Fig. 83) 

A saddle point is a critical point P 0 at which there are two incoming trajectories, two outgoing trajectories, and 
all the other trajectories in a neighborhood of P 0 bypass Pq. 

The system 


(ID 



y'l = 

>4 = -y* 


has a saddle point at the origin. Its characteristic equation (1 — A)( — I — A) = 0 has the roots A* = 1 and 
A 2 = -1. For A = 1 an eigenvector [1 0] T is obtained from the second row of (A - AI)x = 0, that is, 
Qx'i + (- 1 — \)x 2 = 0. For A 2 = — 1 the first row gives [0 1] T . Hence a general solution is 


y = 


T 

J)_ 


e l + c 2 


" 0 “ 

_ 1 _ 


yi = c i e ‘ 
y 2 = c 2 e~ 


or y^y 2 ~ const. 


This is a family of hyperbolas (and the coordinate axes); see Fig. 83. 


Center (Fig. 84) 

A center is a critical point that is enclosed by infinitely many closed trajectories. 
The system 


( 12 ) 



thus 


y[ =^2 

yk = -4yi 


has a center at the origin. The characteristic equation A 2 + 4 = 0 gives the eigenvalues 2/ and —2/. For 2/ an 
eigenvector follows from the first equation — 2/jc x + x 2 = 0 of (A — AI)x = 0, say. [1 2 #] T . For A = -2/ that 

equation is — (— 2i)xi + x 2 = 0 and gives, say, [1 — 2t] T . Hence a complex general solution is 


(12*) 


y = *i 


1 

.2/J 


e 2lt -f c 2 



y\ = Cl* 2 * + C 2 e 2lt 

y 2 = 2 — 2 ic 2 e 2jt . 


The next step would be the transformation of this solution to real form by the Euler formula (Sec. 2.2). But we 
were just curious to see what kind of eigenvalues we obtain in the case of a center. Accordingly, we do not 
continue, but start again from the beginning and use a shortcut. We rewrite the given equations in the form 
y'l = y 2 . 4 y x = — y 2 : then the product of the left sides must equal the product of the right sides, 

4y L vi = ~yzy 2 - By integration, 2 y 2 + \y 2 = const. 

This is a family of ellipses (see Fig. 84) enclosing the center at the origin. B 




Fig. 83. Trajectories of the system (11) 
(Saddle point) 


Fig. 84. Trajectories of the system (12) 
(Center) 
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EXAMPLE 5 


EXAMPLE 6 


Spiral Point (Fig. 85) 

A spiral point is a critical point P 0 about which the trajectories spiral, approaching Po a$ f — * 00 ( or tracing 
these spirals in the opposite sense, away from P 0 ). 

The system 


(13) 



thus 


y'i = -yi + V2 

3 r 2 = “Vi - V 2 


has a spiral point at the origin, as we shall see. The characteristic equation is A 2 4- 2A + 2 = 0. It gives the 
eigenvalues —I + i and — 1 — i. Corresponding eigenvectors are obtained from (—1 - X)xi 4* x 2 = 0. For 
A = — 1 + / this becomes — ixj 4* x 2 = 0 and we can take [1 i] r as an eigenvector. Similarly, an eigenvector 
corresponding to - 1 — / is 1 1 — /] T . This gives the complex general solution 


y = ci 


rn 





+ c 2 





The next step would be the transformation of this complex solution to a real general solution by the Euler 
formula. But, as in the last example, we just wanted to see what eigenvalues to expect in the case of a spiral 
point. Accordingly, we start again from the beginning and instead of that rather lengthy systematic calculation 
we use a shortcut. We multiply the first equation in (13) by yj, the second by y 2 , and add, obtaining 

Vl.v! + V2)’2 = -(.Vl 2 + ,V2 2 )- 

We now introduce polar coordinates t, where r z = y z + y z . Differentiating this with respect to / gives 
2 rr = 23^1 4- 2)^2- Hence the previous equation can be written 

rr = — r 2 . Thus, r = — r, drtr = — dt y In \r\ = -/ 4- c*, r = ce~ l . 

For each real c this is a spiral, as claimed, (see Fig. 85). M 



Fig. 85. Trajectories of the system (13) (Spiral point) 


No Basis of Eigenvectors Available. Degenerate Node (Fig. 86) 

This cannot happen if A in (1) is symmetric ( a jy = aj k , as in Examples 1-3) or skew-symmetric ( a ^ = -a^ 
thus cijj = 0). And it does not happen in many other cases (see Examples 4 and 5). Hence it suffices to explain 
the method to be used by an example. 
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Find and graph a general solution of 


(14) 


y' = Ay = 



y- 


Solution . A is not skew-symmetric! Its characteristic equation is 


det (A - AD 


4 - A 
— I 



= A 2 — 6A + 9 


(A - 3) 2 - 0. 


It has a double root A = 3. Hence eigenvectors are obtained from (4 — A)^ -f .v 2 = 0, thus from + .v 2 = 0, 
say. x <D = [1 — 1 1 T and nonzero multiples of it (which do not help). The method now is to substitute 

y (2> = \te Kt + ue At 

with constant u = [mj tt 2 ] r into (14). (The x/-term alone, the analog of what we did in Sec. 2.2 in the case of 
a double root, would not be enough. Try it.) This gives 

y (2> ' = x<?' u + A xte xt + Aue At = Ay (2) = A \te M + Aue At . 

On the right. Ax = Ax. Hence the terms A xte M cancel, and then division by e xt gives 


x + Au = Au, 

Here A = 3 and x = [1 -1] T , so that 


thus 


(A - ADu = x. 


1 

1 

1 i r i" 


«l + U 2 =1 

(A - 3Du = 

u = 

, thus 

L -i 

2 - 3 J L-l 


-«1 - «2 = -1. 


A solution, linearly independent of x = [l -1 J T . is u = [0 1J T . This yields the answer (Fig. 86) 

y = ciy (1) + C2y <2) = o J « 3t + ' + [, ) • 


The critical point at the origin is often called a degenerate node. Cxy (1) gives the heavy straight line, with 
c'i > 0 the lower part and < 0 the upper part of it. y <2) gives the right part of the heavy curve from 0 through 
the second, first, and — finally — fourth quadrants. — y <2) gives the other part of that curve. H 



Fig. 86. Degenerate node in Example 6 




M>] GENERAL SOLUTION 

Find a real general solution of the following systems. (Show 
the details.) 


1. v 1 = 3y z 
y* = 12; Vr 

3. y[ = yi + v 2 
y 2 = yi + )’2 

5- yi = 4y 2 
y 2 = ~4y 1 


2. yi = 5y 2 
y'2 = 5y t 

4. yi = 9yx + 13.5 y 2 
y 2 = 1.5y! + 9.y 2 

6- yi = 2y, - 2y 2 
y 2 = -)’i + 2.y 2 


7. y i = 2.VJ + 8y 2 - 4y 3 

y 2 = -4y, - 10y 2 + 2y 3 
y ' 3 = — 4.V| - 4v 2 - 4y 3 

8. yi = 8yi — y 2 

y 2 = >’i + IO.V2 

9. yi = — y, + y 2 + 0.4y 3 
y 2 = yi ~ 0. ly 2 + 1.4y 3 
y 3 = 0.4yi + 1.4y a + 0.2y 3 

10-151 INITIAL VALUE PROBLEMS 

Solve the following initial value problems. (Show the details.) 


10 . yi = y, + y 2 
y '2 = 4y x + y 2 


11. y'l = yi + 2y 2 
y '2 = ivi + y 2 


yi(0) = 1, y 2 (0) = 6 y,(0) = 16, y 2 (0) = -2 


12. yi = 3y x + 2y 2 
>'2 = 2y v + 3y 2 
y t (0) = 7, y 2 (0) = 7 


13. yi = ivj - 2y 2 
>'2 = -|.Vi + y 2 
y.(0) = 0.4,y 2 (0) = 3.8 


14. yi = -.y, + 5y 2 15. yi = 2yj + 5y 2 


y a = -yi + 3y 2 
y,(0) = 7, y 2 (0) = 2 


y ’2 = 5}-! + 12.5y 2 
>'i(0) = 12, y 2 (0) = 1 


Plane. Qualitative Methods 


I) with three or more equations and a triple eigenvalue 
dent eigenvector, one will get two solutions, as just 
dependent one from 


with v from 


1 6—1 7 1 CONVERSION 

Find a general solution by conversion to a single ODE. 

16. The system in Prob. 8. 

17. The system in Example 5. 

18. (Mixing problem, Fig. 87) Each of the two tanks 
contains 400 gal of water, in which initially 100 lb 
(Tank T x ) and 40 lb (Tank T 2 ) of fertilizer are 
dissolved. The inflow, circulation, and outflow are 
shown in Fig. 87. The mixture is kept uniform by 
stirring. Find the fertilizer contents y^O in T r and y 2 (/) 
in T 2 . 



Fig. 87. Tanks in Problem 18 


19. (Network) Show that a model for the currents Ii(t) and 
I 2 (t) in Fig. 88 is 

h dr + R(h - l 2 ) = 0, LI 2 + R(I Z - I t ) = 0. 

Find a general solution, assuming that R = 20 fl, 
L = 0.5 H, C = 2 • 1(T 4 F. 

20. CAS PROJECT. Phase Portraits. Graph some of the 
figures in this section, in particular Fig. 86 on the 
degenerate node, in which the vector y (2) depends on 
t. In each figure highlight a trajectory that satisfies an 
initial condition of your choice. 


C 



Fig. 88. Network in Problem 19 
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4.4 Criteria for Critical Points. Stability 

We continue our discussion of homogeneous linear systems with constant coefficients 


(1) y' = Ay = 


a n 

_ a 21 



a 22. 


in components. 


y'i = tfnJh + ci 12 y 2 

?2 = a 2iyi + ^ 22 ^ 2 * 


From the examples in the last section we have seen that we can obtain an overview of 
families of solution curves if we represent them parametrically as y(f) = L>^(r) y 2 (t)] J 
and graph them as curves in the y^-plane, called the phase plane. Such a curve is called 
a trajectory of (1), and their totality is known as the phase portrait of (1). 

Now we have seen that solutions are of the form 


y(0 = xe xt . Substitution into (1) gives y'(/) = \xe xt = Ay = Ax* At . 
Dropping the common factor e Kt , we have 


( 2 ) 


Ax = Ax. 


Hence y (t) is a (nonzero) solution of (1) if A is an eigenvalue of A and x a corresponding 
eigenvector. 

Our examples in the last section show that the general form of the phase portrait is 
determined to a large extent by the type of critical point of the system (1) defined as a 
point at which dy 2 ldy 1 becomes undetermined, 0/0; here [see (9) in Sec. 4.3] 

(3) _ y'zdt = a^yi + a 22 y 2 

d y\ y[ dt a iiyi + «123'2 ’ 


We also recall from Sec. 4.3 that there are various types of critical points, and we shall 
now see how these types are related to the eigenvalues. The latter are solutions A = A x 
and A 2 of the characteristic equation 


( 4 ) 


det (A - AI) = 


- A 


a 21 


a 12 


a 22 ~ ^ 


= A 2 — (i a n + a 2 z ) A -f det A = 0. 


This is a quadratic equation A 2 - pX + q = 0 with coefficients p, q and discriminant A 
given by 


(5) p = a 1± + a 2 2 , q = det A = a n a 22 ~ a 12 a 2X , A = p 2 - 4 q. 


From calculus we know that the solutions of this equation are 

(6) a x = Up + vSj. a 2 = Up - VA). 

Furthermore, the product representation of the equation gives 

A 2 - p\ + q = (A — Ai)(A - A*) = A 2 - (A x + A 2 )A + AjAg. 



148 


CHAP. 4 Systems of ODEs. Phase Plane. Qualitative Methods 


Hence p is the sum and q die product of the eigenvalues. Also A x — A 2 = VA from (6). 
Together, 

(7) p = Aj + A 2 , q — AiA 2 , A = (A* — A 2 ) 2 . 

This gives die criteria in Table 4.1 for classifying critical points. A derivation will be 
indicated later in this section. 


Table 4.1 Eigenvalue Criteria for Critical Points 
(Derivation after Table 4.2) 


Name 

p = Aj + A 2 

q — A x A 2 

> 

II 

>> 

>-* 

1 

£ 

'To 

Comments on A x , A 2 

(a) Node 


q> 0 

ASO 

Real, same sign 

(b) Saddle point 


q < 0 


Real, opposite sign 

(c) Center 

p = 0 

q > 0 


Pure imaginary 

(d) Spiral point 

p =£ 0 


A < 0 

Complex, not pure 
imaginary 


Stability 

Critical points may also be classified in terms of their stability. Stability concepts are basic 
in engineering and other applications. They are suggested by physics, where stability 
means, roughly speaking, that a small change (a small disturbance) of a physical system 
at some instant changes the behavior of the system only slightly at all future dmes /. For 
critical points, the following concepts are appropriate. 


DEFINITIONS 


Stable, Unstable, Stable and Attractive 

A critical point P 0 of (1) is called stable 2 if, roughly, all trajectories of (1) that at 
some instant are close to P 0 remain close to P 0 at all future times; precisely: if for 
every disk D e of radius e > 0 with center P 0 there is a disk D s of radius 8 > 0 with 
center P 0 such that every trajectory of (1) that has a point P x (corresponding to 
t = t l9 say) in D s has all its points corresponding to t ^ in D e . See Fig. 89. 

P 0 is called unstable if P 0 is not stable. 

P 0 is called stable and attractive (or asymptotically stable) if P 0 is stable and 
every trajectory that has a point in D 8 approaches P 0 as t—> See Fig. 90. 


Classification criteria for critical points in terms of stability are given in Table 4.2. Both 
tables are summarized in the stability chart in Fig. 91. In this chart the region of instability 
is dark blue. 


2 In the sense of the Russian mathematician ALEXANDER M1CHAILOVICH UAPUNOV (1 857-1 9 IS), 
whose work was fundamental in stability theory for ODEs. This is perhaps the most appropriate definition of 
stability (and the only we shall use), but there are others, too. 
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EXAMPLE 1 



Fig. 89. Stable critical point P 0 of (1) (The trajectory Fig. 90. Stable and attractive critical 
initiating at stays in the disk of radius e) point P 0 of (1) 


Table 4.2 Stability Criteria for Critical Points 


Type of Stability 

p = A x + A 2 

q — A X A 2 

(a) Stable and attractive 

(b) Stable 

(c) Unstable 

^3 ^3 ^3 

V IIA A 
o o o 

O 

q > 0 
q > 0 
R q < 0 



Fig. 91. Stability chart of the system (1) with p, q, A defined in (5). 

Stable and attractive: The second quadrant without the q-axls. 
Stability also on the positive q-axis (which corresponds to centers). 
Unstable: Dark blue region 


We indicate how the criteria in Tables 4.1 and 4.2 are obtained. If q = > 0, 

both eigenvalues are positive or both are negative or complex conjugates. If also 
p = A x + A 2 < 0, both are negative or have a negative real part. Hence P 0 is stable 
and attractive. The reasoning for the other two lines in Table 4.2 is similar. 

If A < 0, the eigenvalues are complex conjugates, say, A x = a + ifi and A 2 = a. — i/3. 
If also p = A x + A 2 = 2a < 0, this gives a spiral point that is stable and attractive. If 
p = 2a > 0, this gives an unstable spiral point. 

If p = 0, then A 2 = — A x and q = A X A 2 = -A x 2 . If also q > 0, then A x 2 = — q < 0, 
so that A x , and thus A 2 , must be pure imaginary. This gives periodic solutions, their 
trajectories being closed curves around P 0> which is a center. 


Application of the Criteria in Tables 4.1 and 4.2 


In Example l t Sec. 4.3, we have y* 



is stable and attractive by Table 4.2(a). 


8, A = 4, a node by Table 4. 1 (a), which 
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EXAMPLE 2 Free Motions of a Mass on a Spring 

What kind of critical point does my n + cy f + ky = 0 in Sec. 2.4 have? 

Solution . Division by m gives y H — —{khn)y — ( c/m)y f . To get a system, set y x = y, y 2 — y (see Sec. 
4*1). Then y2 — y" — —(khn)y 1 — (c/m)y 2 - Hence 


r 0 1 i 


-A 1 

= y, 

del (A - AI) = 


--khn -dmJ 


1 

1 

1 

1 


9 c k 
= A 2 + — A + — = 0. 


We see that p = ~c/m. q = khn, A = {dm) 2 — 4 khn. From this and Tables 4.1 and 4.2 we obtain the following 
results. Note that in the last three cases the discriminant A plays an essential role. 


No damping . c = 0, p = 0. q > 0, a center. 

Underdamping, c 2 < 4mk. p < 0, q > 0, A < 0, a stable and attractive spiral point. 

Critical damping, c 2 = 4 mk, p < 0, q > 0, A = 0, a stable and attractive node. 

Overdamping, c 2 > 4 mk, p < 0, q > 0, A > 0, a stable and attractive node. B 





TYPE AND STABILITY OF CRITICAL POINT 


Determine the type and stability of the critical point. Then 
find a real general solution and sketch or graph some of the 
trajectories in the phase plane. (Show the details of your 
work.) 


i. y[ = 2 v 2 

y 2 = 8>’j 

3. y[ = 2y x + y z 

yz= yi + 2}’2 

5. y[ = — 4yj + y 2 

y 2 = 3 1 ! - 4y 2 

7. y[ = -2 y 2 

y 2 = 8yj 


2 - y'l = 4y x 
y 2 = 3y 2 

4 - y{ = )’2 

y'2 = -5>’x - 2y 2 

6. yj = y x + I0y 2 
y 2 = 7.Vi ~ 8y 2 

8 - y'i = 3yj + 5y 2 
.V 2 = _ 5y, - 3y 2 


9- y'i = yi + 2y 2 
y '2 = 2y x + y 2 
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FORM OF TRAJECTORIES 


What kind of curves are the trajectories of the following 
ODEs in the phase plane? 

10. y" 4- 5y f = 0 

11. y" - k 2 y = 0 

12. y* + jgy = 0 


13. (Damped oscillation) Solve y" 4- 4y' 4- 5 y = 0. What 
kind of curves do you get as trajectories? 


14. (Transformation of variable) What happens to the 
system (1) and its critical point if you introduce 7 = —t 
as a new independent variable? 

15. (Types of critical points) Discuss the critical points in 
(10)-(14) in Sec. 4.3 by applying the criteria in Tables 
4.1 and 4.2 in this section. 

16. (Perturbation of center) If a system has a center as 
its critical point, what happens if you replace the matrix 
A by A = A 4- kl with any real number k =£ 0 
(representing measurement errors in the diagonal 
entries)? 

17. (Perturbation) The system in Example 4 in Sec. 4.3 
has a center as its critical point. Replace each cij k in 
Example 4, Sec. 4.3, by aj k 4- b. Find values of b such 
that you get (a) a saddle point, (b) a stable and attractive 
node, (c) a stable and attractive spiral, (d) an unstable 
spiral, (e) an unstable node. 

18. CAS EXPERIMENT. Phase Portraits. Graph phase 
portraits for the systems in Prob. 17 with the values of 
b suggested in the answer. Try to illustrate how the phase 
portrait changes “continuously” under a continuous 
change of b. 

19. WRITING EXPERIMENT. Stability. Stability 
concepts are basic in physics and engineering. Write a 
two-part report of 3 pages each (A) on general 
applications in which stability plays a role (be as 
precise as you can), and (B) on material related to 
stability in this section. Use your own formulations and 
examples; do not copy. 

20. (Stability chart) Locate the critical points of die 
systems ( 10>— ( 14) in Sec. 4.3 and of Probs. 1, 3, 5 in 
this problem set on the stability chart. 
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4 .! Qualitative Methods for Nonlinear Systems 

Qualitative methods are methods of obtaining qualitative information on solutions 
without actually solving a system. These methods are particularly valuable for systems 
whose solution by analytic methods is difficult or impossible. This is the case for many 
practically important nonlinear systems 


( 1 ) 


y' = f(y), thus 


y[ = / iO'i> ^2) 

yL = fziyi, y& 


In this section we extend phase plane methods, as just discussed, from linear systems 
to nonlinear systems (1). We assume that (1) is autonomous, that is, the independent 
variable t does not occur explicitly. (All examples in the last section are autonomous.) 
We shall again exhibit entire families of solutions. This is an advantage over numeric 
methods, which give only one (approximate) solution at a time. 

Concepts needed from the last section are the phase plane (they^-plane), trajectories 
(solution curves of (1) in the phase plane), the phase portrait of (1) (the totality of these 
trajectories), and critical points of (1) (points (y v y 2 ) at which both fi(y x , y 2 ) and f 2 (y x , y 2 ) 
are zero). 

Now (1) may have several critical points. Then we discuss one after another. As a 
technical convenience, each time we first move the critical point P 0 : (a, b) to be considered 
to the origin (0, 0). This can be done by a translation 

Ji = yi~ a, y 2 = y2- b 

which moves P 0 to (0, 0). Thus we can assume P Q to be the origin (0, 0), and for 
simplicity we continue to write y lt y 2 (instead of y x , y 2 ). We also assume that P Q is 
Isolated, that is, it is the only critical point of (1) within a (sufficiently small) disk with 
center at the origin. If (1) has only finitely many critical points, this is automatically 
true. (Explain!) 


Linearization of Nonlinear Systems 

How can we determine the kind and stability property of a critical point P 0 : (0, 0) of 
(1)? In most cases this can be done by linearization of (1) near P 0 , writing (1) as 
y f = f(y) = Ay + h(y) and dropping h(y), as follows. 

Since P 0 is critical, f x ( 0, 0) = 0, / 2 (0, 0) = 0, so that f x and f 2 have no constant terms 
and we can write 


(2) y' = Ay + h(y). 


thus 


yi = a nyi + «i2)’2 + Aita. y2) 

)'2 = <3 2 l3’l + «22>'2 + /* 2 Ol, J' 2 )- 


A is constant (independent of t) since (1) is autonomous. One can prove the following 
(proof in Ref. [A7], pp. 375-388, listed in App. 1). 
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THEOREM 1 


EXAMPLE 1 


Linearization 

If f i and f 2 in (1) are continuous and have continuous partial derivatives in a 
neighborhood of the critical point P 0 : (0, 0), and if det A i=- 0 in (2), then the kind 
and stability of the critical point of { 1 ) are the same as those of the linearized 
system 

Vi = tfnVi + a 12 )’2 

(3) y = Ay, thus 

.V2 = «21.Vl + «223’2- 

Exceptions occur if A has equal or pure imaginary eigenvalues; then (1) may have 
the same kind of critical point as (3) or a spiral point. 


Free Undamped Pendulum. Linearization 

Figure 92a shows a pendulum consisting of a body of mass m (the bob) and a rod of length L. Determine the 
locations and types of the critical points. Assume that the mass of the rod and air resistance are negligible. 

Solution . Step I. Setting up the mathematical model. Let 8 denote the angular displacement, measured 
counterclockwise from the equilibrium position. The weight of the bob is mg (g the acceleration of gravity). It 
causes a restoring force mg sin 0 tangent to the curve of motion (circular arc) of the bob. By Newton's second 
law, at each instant this force is balanced by the force of acceleration niLO H , where L0" is the acceleration; 
hence the resultant of these two forces is zero, and we obtain as the mathematical model 


mLO " + mg sin 6 = 0. 

Dividing this by mL , we have 

(4) 0" + *sin0=O ( A= i)" 

When 6 is very small, we can approximate sin 8 rather accurately by 6 and obtain as an approximate solution 
A cos VT/ + B sin Vfc/, but the exact solution for any 0 is not an elementary function. 

Step 2. Critical points (0, 0), ±(2tt, 0), ±(477, 0), • • • , Linearization. To obtain a system of ODEs, we set 
Q = .V|, $' = y 2 - Then from (4) we obtain a nonlinear system (I) of the form 


(4*) 


>*i ~ /lO’l* V 2 ) “ V2 
y 2 = / 2 OT' .V 2 ) = ~k sin Vi . 


The right sides are both zero when y 2 = 0 and sin v*i — 0. This gives infinitely many critical points (//tt, 0), 
where n = 0, ± 1, ±2, • - * . We consider (0, 0). Since the Maclaurin series is 

sin >! = v, - l v’! 3 + .Vi, 

the linearized system at (0. 0) is 

, r 0 11 y[ = .v 2 

y = Ay = y, thus 

L-Jt oJ v 2 = -ky v 

To apply our criteria in Sec. 4.4 we calculate p - a u + a 2 2 - 0, q = det A - k — g/L (> 0) f and 
A = p 2 — 4q — -4k. From this and Table 4.1(c) in Sec. 4.4 we conclude that (0, 0) is a center, which is always 
stable. Since sin 6 = sin .V! is periodic with period 2 tt, the critical points ( jitt , 0), ;; = ±2, ±4, ■ ■ • , are all 
centers. 

Step 3. Critical points ±( 77 , 0), ±(3rr, 0), ±(577, 0), • • • , Linearization. We now consider the critical point 
(77, 0), setting Q - tt - Vj and (6 - wf = $’ = y 2 . Then in (4). 

sin 6 = sin (vj. 4 77) = -sinyj = -yj + - Vl 
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EXAMPLE 2 


and the linearized system at (t r, 0) is now 


y' = Ay = 


"0 
_i k 



thus 


y{ ~ . v 2 
y2 = *vi- 


We see that p = 0, q = -k (< 0), and A = —Aq = Ak. Hence, by Table 4.1(b), this gives a saddle point, which 
is always unstable. Because of periodicity, the critical points (ntt, 0 )> n = ±1, ±3, * • ■ , are all saddle points. 
These results agree with the impression we get from Fig. 92b. I 



(a) Pendulum 



(b) Solution curves y 2 (yj) of (4) in the phase plane 
Fig. 92. Example 1 (C will be explained in Example 4.) 


Linearization of the Damped Pendulum Equation 

To gain further experience in investigating critical points, as another practically important case, let us see how 
Example 1 changes when we add a damping term cO' (damping proportional to the angular velocity) to equation 

(4) , so that it becomes 

(5) e" + cB' + k sin 6= 0 

where k > 0 and c ^ 0 (which includes our previous case of no damping, c = 0). Setting 0 = O' — y%> as 
before, we obtain the nonlinear system (use 6 n = y^) 

y'i = >’2 

y 2 = -A sin y 1 - cy 2 . 

We see that the critical points have the same locations as before, namely. (0, 0), (±7 r, 0), (±277, 0), • * • . We 
consider (0, 0). Linearizing sin y± Vj as in Example 1 , we get the linearized system at (0, 0) 

, f 0 H y[ = y 2 

(6) y = Ay = y, thus 

L—k — cJ y 2 = ~h f i ~ 02- 

This is identical with the system in Example 2 of Sec 4.4, except for the (positive!) factor m (and except for 
the physical meaning of yi). Hence for c = 0 (no damping) we have a center (see Fig. 92b). for small damping 
we have a spiral point (see Fig. 93), and so on. 

We now consider the critical point ( 7 r, 0). We set 0 — tt = y lt ( 6 — if)' = 0* = y 2 and linearize 
sin 0 — sin + 7r) = —sin yj -yj. 


This gives the new linearized system at (77, 0) 



thus 


y\ = y% 

V2 = ^’1 “ 0 ’ 2 - 
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EXAMPLE 3 


For our criteria in Sec 4.4 we calculate p = + a 22 = ~ c - V = det A = — k , and A = p 2 — Aq = c 2 + 4k. 

This gives the following results for the critical point at (it, 0). 

No damping . c = 0. p = 0. q < 0. A > 0. a saddle point. See Fig. 92b. 

Damping, c > 0, p < 0, q < 0, A > 0, a saddle point. See Fig. 93. 

Since sin y x is periodic with period 2 tt, the critical points (±27 t, 0) t (±4ir t 0), * • • are of the same type as 
(0, 0), and the critical points ( — -tt. 0). (±3w, 0). • • • are of the same type as (tt, 0). so that our task is Finished. 

Figure 93 shows the trajectories in the case of damping. What we see agrees with our physical intuition. Indeed, 
damping means loss of energy. Hence instead of the closed trajectories of periodic solutions in Fig. 92b we now 
have trajectories spiraling around one of the critical points (0, 0). (±2 < tt, 0), • • * . Even the wavy trajectories 
corresponding to whirly motions eventually spiral around one of these points. Furthermore, there are no more 
trajectories that connect critical points (as there were in the undamped case for the saddle points). ■ 



Fig. 93. Trajectories in the phase plane for the damped pendulum 
in Example 2 


Lotka-Volterra Population Model 

Predator-Prey Population Model 3 

This model concerns two species, say. rabbits and foxes, and the foxes prey on the rabbits. 

Step 1 . Setting up the model We assume the following. 

1. Rabbits have unlimited food supply. Hence if there were no foxes, their number y^/) would grow 
exponentially, y[ = a\'i. 

2. Actually, y x is decreased because of the kill by foxes, say, at a rate proportional to where y 2 (/) is 

the number of foxes. Hence y{ = ayi - byiy 2 , where a > 0 and b > 0. 

3. If there were no rabbits, then y 2 (/) would exponentially decrease to zero. y 2 = — /y 2 - However, y 2 is 
increased by a rate proportional to the number of encounters between predator and prey; together we 
have y 2 = - (v 2 + AtlV 2 . where k > 0 and / > 0. 

This gives the (nonlinear!) Lotka-Volterra system 

„ y'i = ZiO'i. >’2) = <*yi - i>yi>'2 

( 7 ) 

v 2 = / 2 0’1i J 2 ) = “ l >'2 • 


3 Introduced by ALFRED J. LOTKA (1880-1949). American biophysicist, and VITO 
(1860-1940), Italian mathematician, the initiator of functional analysis (see fGR7] in App. 1). 


VOLTERRA 
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Step 2. Critical point (0, 0), Linearization . We see from (7) that the critical points are the solutions of 
C 7 *) fi(yi> = )’i(a - by 2 ) = 0, / 2 0’1. >'2) = >’2(*>’1 -0 = 0. 


The solutions are (y L , y 2 ) = (0, 0) and ( -^ , y ). We consider (0, 0). Dropping —by x y 2 and ky x y 2 from (7) gives 
the linearized system 



y- 


Its eigenvalues are A x = a > 0 and A 2 = — / < 0. They have opposite signs, so that we get a saddle point. 

Step 3. Critical point (//ft, alb\ Linearization. We set y x = yi + f/ft» >’2 “ ?2 + °!b. Then the critical point 
(//ft, <z/ft) corresponds to (y iy y 2 ) = (0, 0). Since y[ = yj, y 2 = v 2 , we obtain from (7) [factorized as in (8)] 


y'i = ^1+ ^ - *(.V2 + 

yL= (* + f ) [*(* + j) 


f)] = ( yi+ i) ( ' 65y 

" '] = ( ?2+ f K 


Dropping the two nonlinear terms — fty 1# v 2 and ky{y 2 , we have the linearized system 


(7**) 


, , lb _ 

(a) y t = - — y 2 

n oK 

(b) ,V2 = y .vi- 


The left side of (a) times the right side of (b) must equal the right side of (a) times the left side of (b). 


ak , lb , _ . 

— = ~ -j-yzy* B >' integration, 



lb o 

— Vo = const. 
ft ' 


This is a family ellipses, so that the critical point (//ft, alb) of the linearized system (7**) is a center (Fig. 94). 
It can be shown by a complicated analysis that the nonlinear system (7) also has a center (rather than a spiral 
point) at (//ft, alb) surrounded by closed trajectories (not ellipses). 

We see that the predators and prey have a cyclic variation about the critical point. Let us move counterclockwise 
around the ellipse, beginning at the right vertex, where the rabbits have a maximum number. Foxes are sharply 
increasing in number until they reach a maximum at the upper vertex, and the number of rabbits is then sharply 
decreasing until it reaches a minimum at the left vertex, and so on. Cyclic variations of this kind have been 
observed in nature, for example, for lynx and snowshoe hare near the Hudson Bay, with a cycle of about 10 
years. 

For models of more complicated situations and a systematic discussion, see C. W. Clark, Mathematical 
Bioeconomics (Wiley, 1976). I 



Fig. 94. Ecological equilibrium point and trajectory 
of the linearized Lotka-Volterra system (7**) 
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EXAMPLE 4 


Transformation to a First-Order Equation 
in the Phase Plane 

Another phase plane method is based on the idea of ttansforming a second-order 
autonomous ODE (an ODE in which t does not occur explicitly) 

F(y,y\y") = 0 

to first order by taking y = as the independent variable, setting y f = y 2 and transforming 
y" by the chain rule, 

n , dy 2 _ dy 2 dy x dy 2 

y ' 2 dt d\’i dr dy j. >2 ' 

Then the ODE becomes of first order, 

(» f-(.v 1 .v 2 ,g-.v a )-0 


and can sometimes be solved or treated by direction fields. We illustrate this for the 
equation in Example 1 and shall gain much more insight into the behavior of solutions. 


An ODE (8) for the Free Undamped Pendulum 

If in (4) 0 ” + k sin 6 = 0 we sel 0 = d' = _v 2 (ihe angular velocity) and use 


0” = 


nr 


<i > f 2 dy 1 
chu dt 


dy? 

dy 


~ V2’ 


we gel 


d\ f 2 , . 

— y 2 = -A sin V].. 


UT 


dy 


'Vi 


Separation of variables gives v 2 dy 2 — —k sin Vj dy 1# By integration. 

(9) 2 . v 2 2 = ^ cos >’i + C (C constant). 

Multiplying this by mL 2 , we get 


|/??(Lv 2) 2 — wL 2 fc cos Vi — ml?C. 

We see that these three terms are energies. Indeed. y 2 is die angular velocity, so that Lv 2 is the velocity and the 
first term is the kinetic energy. The second term (including the minus sign) is the potential energy of the pendulum, 
and mL?C is its total energy, which is constant, as expected from the law of conservation of energy, because 
there is no damping (no loss of energy). The type of motion depends on the total energy, hence on C. as follows. 

Figure 92b on p. 153 shows trajectories for various values of C. These graphs continue periodically with 
period 2 tt to the left and to the right. We see that some of them are ellipse-like and closed, others are wavy, 
and there are two trajectories (passing through the saddle points (/jtt, 0), n = ±1, ±3, • • ■ ) that separate 
those two types of trajectories. From (9) we see that the smallest possible C is C - —k; then y 2 = 0, and 
cos Vj = 1, so that the pendulum is at rest. The pendulum will change its direction of motion if there are points 
at which _y 2 = = 0. Then k cos Vj + C = 0 by (9). If y± = 7 r. then cos Vj = — l and C — k. Hence if 

-k < C < k . then the pendulum reverses its direction for a | _v l | * |0| < tt. and for these values of C with 
|C| < k the pendulum oscillates. This corresponds to the closed trajectories in the figure. However, if C > k, 
then v 2 = 0 is impossible and the pendulum makes a whirly motion that appears as a wavy trajectory in the 
yiv 2 -plane. Finally, the value C — k corresponds to the two “separating trajectories’* in Fig. 92b connecting the 
saddle points. ■ 

The phase plane method of deriving a single first-order equation (8) may be of practical interest 
not only when (8) can be solved (as in Example 4) but also when solution is not possible and 
we have to utilize direction fields (Sec. 1 .2). We illustrate this with a very famous example: 
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EXAMPLE 5 Self-Sustained Oscillations. Van der Pol Equation 

There are physical systems such that for small oscillations, energy is fed into the system, whereas for large 
oscillations, energy is taken from the system. In other words, large oscillations will be damped, whereas for 
small oscillations there is “negative damping” (feeding of energy into the system). For physical reasons we 
expect such a system to approach a periodic behavior, which will thus appear as a closed trajectory in the phase 
plane, called a limit cycle. A differential equation describing such vibrations is the famous van der Pol 
equation 4 

(10) y" — /x( I — y 2 )/ + y — 0 (fx > 0, constant). 


It first occurred in the study of electrical circuits containing vacuum tubes. For ix = 0 this equation becomes 
y" + v = 0 and we obtain harmonic oscillations. Let fx > 0. The damping term has the factor -ja(1 — y 2 ). 
This is negative for small oscillations, when y 2 < 1 , so that we have “negative damping,” is zero for y 2 = 1 (no 
damping), and is positive if y 2 > J (positive damping, loss of energy). If /x is small, we expect a limit cycle 
that is almost a circle because then our equation differs but little from y" + y = 0. If fx is large, the limit 
cycle will probably look different 

Setting y = Vj, y' = y 2 and using y n - (dy 2 /dy 1 )y 2 as in (8), we have from (10) 


(ID 


l v2 o 

— ,V 2 - MI - .Vj )>'2 + .Vi = o. 

dyi 


The isoclines in the yiJfe-plane (the phase plane) are the curves dy 2 /dy x = K = const, that is. 


dy 2 
d)i 


= M(l - .Vi 2 ) - — = K. 

>’2 


Solving algebraically for y 2 , we see that the isoclines are given by 


y% /x(l -V, 2 ) - K 


(Figs. 95, 96). 



Fig. 95. Direction field for the van der Pol equation with fx = 0.1 in the phase plane, 
showing also the limit cycle and two trajectories. See also Fig. 8 in Sec. 1.2. 


4 BALTHASAR VAN DER POL (1889-1959), Dutch physicist and engineer. 
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Figure 95 shows some isoclines when fx is small, fx = 0.1, the limit cycle (almost a circle), and two (blue) 
trajectories approaching it. one from the outside and the other from the inside, of which only the initial portion, 
a small spiral, is shown. Due to this approach by trajectories, a limit cycle differs conceptually from a closed 
curve (a trajectory) surrounding a center, which is not approached by trajectories. For larger fx the limit cycle 
no longer resembles a circle, and the trajectories approach it more rapidly than for smaller fx. Figure 96 illustrates 
this for fx = 1 . ■ 



Fig. 96. Direction field for the van der Pol equation with fx = 1 in the phase plane, 
showing also the limit cycle and two trajectories approaching it 
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CRITICAL POINTS, LINEARIZATION 


Determine the location and type of all critical points by 
linearization. In Probs. 7-12 first transform the ODE to a 
system. (Show the details of your work.) 

1* y'l ~ yz + >’2 2 2 - vi = 4y, - V, 2 

y 2 = 3 )’! y 2 = y 2 


3- y'l = 4v 2 

)' 2 = -)’i — y* 

s - y'i = -yi + y 2 ~ y 2 
y 2 = -vi - y 2 

7. y" + y - 4v 2 = 0 


4. .vl = -3}>! + v 2 - y 2 
y'z = Vi - 3y 2 

6 - y[ = y 2 ~ y 2 
y 2 = yi - yi 2 

8. y" + 9v + y 2 = 0 


9. y" + cos y = 0 10. y" + sin y — 0 

11. y" + 4^ - y 3 = 0 12. y" + y' + 2y - y 2 = 0 

13. (Trajectories) What kind of curves are the trajectories 
of yy" + 2y' 2 = 0? 

14. (Trajectories) Write the ODE y" — 4y + y 3 = 0 as a 
system, solve it for v 2 as a function of y lt and sketch 
or graph some of the trajectories in the phase plane. 

15. (Trajectories) What is the radius of a real general 
solution of y" + y = 0 in the phase plane? 

16. (Trajectories) In Prob. 14 add a linear damping term 
to get y" + 2 y' —4 y + y 3 = 0. Using arguments from 
mechanics and a comparison with Prob. 14, as well as 
with Examples 1 and 2, guess the type of each critical 
point. Then determine these types by linearization. 
(Show all details of your work.) 
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17. (Pendulum) To what state (position, speed, direction 
of motion) do the four points of intersection of a 
closed trajectory with the axes in Fig. 92b correspond? 
The point of intersection of a wavy curve with the 
y 2 -axis? 

18. (Limit cycle) What is the essential difference between 
a limit cycle and a closed trajectory surrounding a 
center? 

19. CAS EXPERIMENT. Deformation of Limit Cycle. 
Convert the van der Pol equation to a system. Graph 
the limit cycle and some approaching trajectories for 
1 1 = 0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0. Try to observe how 
the limit cycle changes its form continuously if you 
vary ft continuously. Describe in words how the limit 
cycle is deformed with growing p. 

20. TEAM PROJECT. Self-sustained oscillations, 
(a) Van der Pol Equation. Determine the type of the 
critical point at (0, 0) when p > 0, p = 0, p < 0. 
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Show that if p — > 0, the isoclines approach straight 
lines through the origin. Why is this to be expected? 

(b) Rayleigh equation. Show that the so-called 
Rayleigh equation 5 

y" - p(\ - £y' 2 )r' + y = o (p > o) 

also describes self-sustained oscillations and that by 
differentiating it and setting y = Y f one obtains the van 
der Pol equation. 

(c) Duffing equation. The Duffing equation is 

y” 4- o) 0 2 y 4- f3y 3 = 0 

where usually |/3| is small, thus characterizing a small 
deviation of the restoring force from linearity. /3 > 0 
and p < 0 are called the cases of a hard spring and a 
soft spring , respectively. Find the equation of the 
trajectories in the phase plane. (Note that for ft > 0 all 
these curves are closed.) 


4.6 Nonhomogeneous Linear Systems of ODEs 

In this last section of Chap. 4 we discuss methods for solving nonhomogeneous linear 
systems of ODEs 

(1) y' = Ay 4- g (see Sec. 4.2) 

where the vector g(/) is not identically zero. We assume g(/) and the entries of the n X n 
matrix A (/) to be continuous on some interval J of the /-axis. From a general solution 
y 0l \t) of the homogeneous system y' = Ay on J and a particular solution y (p) (f) of 

(1) on J [i.e., a solution of (1) containing no arbitrary constants], we get a solution 
of(l), 

(2) y = y (,l) + y ( P\ 

y is called a general solution of (1) on J because it includes every solution of (1) on J . 
This follows from Theorem 2 in Sec. 4.2 (see Prob. 1 of this section). 

Having studied homogeneous linear systems in Secs. 4. 1-4.4, our present task will be 
to explain methods for obtaining particular solutions of (1). We discuss the method of 
undetermined coefficients and the method of the variation of parameters; these have 
counterparts for a single ODE, as we know from Secs. 2.7 and 2.10. 


5 L0RD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919). great English physicist and mathematician, 
professor at Cambridge and London, known by his important contributions to the theory' of was r es, elasticity 
theory, hydrodynamics, and various other branches of applied mathematics and theoretical physics. In 1904 he 
received the Nobel Prize in physics. 
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EXAMPLE 1 


Method of Undetermined Coefficients 

As for a single ODE, this method is suitable if the entries of A are constants and the 
components of g are constants, positive integer powers of t , exponential functions, or 
cosines and sines. In such a case a particular solution y (p) is assumed in a form similar 
to g; for instance, y (p) = u + \t + y/t 2 if g has components quadratic in r, with u, v, w 
to be determined by substitution into (1). This is similar to Sec. 2.7, except for the 
Modification Rule. It suffices to show this by an example. 


Method of Undetermined Coefficients. Modification Rule 

Find a general solution of 

( 3 ) y' = Ay 4 g = i 3 y + C ~ 2t 

Solution . A general equation of the homogeneous system is (see Example 1 in Sec. 4.3) 


(4) 


„(h) 


1 —2 t , ^ —4 t 

= Cl e 4 c 2 e \ 

_i J L-i. 


Since A = —2 is an eigenvalue of A, the function e“ 2t on the right also appears in y (,i \ and we must apply the 
Modification Rule by setting 

y (p) = u te~ 2t 4 ve“ 2t (rather than ue -2t ). 


Note that the first of these two terms is the analog of the modification in Sec. 2.7, but it would not be sufficient 
here. (Try it.) By substitution. 


y (p> ' = ue~ 2t - 2u te 2t - 2ve 2t = Au/e 21 4 Ave 2t 4 g. 

Equating the ;e“ 2t -terms on both sides, we have -2u = Au. Hence u is an eigenvector of A corresponding to 
A = -2; thus [see (5)] u = a[i 1] T with any a =£ 0. Equating the other terms gives 


’-6 a r 2u ii + y 2 i r-6- 

u — 2v = Av 4 thus — = 4 

2j \_aj lv 2 v 1 — 3v 2 2 


Collecting terms and reshuffling gives 


Vi - v 2 = -a - 6 
—Vi 4 u 2 = —a 4 2. 


By addition, 0 = —2 a - 4, a = -2, and then v 2 38 t/j 4 4, say, Uj - k, v 2 = k + 4, thus, v = [k k 4 4] T . 
We can simply choose k = 0. This gives the answer 


(5) 


y = y iM 4 y (p) = c x e~ 2t 4 c 2 e~ 4t - 2 te~ 2t 

Lu L-u Lu 


For other k we get other v; for instance, k = -2 gives v = [-2 2] T , so that the answer becomes 


etc. 


Method of Variation of Parameters 

This method can be applied to nonhomogeneous linear systems 


( 6 ) 


y' = A(/)y + g(f) 



SEC. 4.6 Nonhomogeneous Linear Systems of ODEs 
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EXAMPLE 2 


with variable A = A (r) and general g(r). It yields a particular solution y (p) of (6) on some 
open interval J on the f-axis if a general solution of the homogeneous system y f = A(/)y 
on J is known. We explain the method in terms of the previous example. 


Solution by the Method of Variation of Parameters 

Solve (3) in Example l. 

Solution . A basis of solutions of the homogeneous system is U” 2t e“ 2t ] T and U~ 4t -e -4t ] T . Hence 

the general solution (4) of the homogenous system may be written 


(7) 


y (,l) = 


r — 2 t — 4t~i 


- 

e e 



% 

1 

« 

1 



L* J 


S2. 


= Y(f)c. 


Here, Y(/) = [y a> y <2) ] T is the fundamental matrix (see Sec. 4.2). As in Sec. 2.10 we replace the constant 
vector c by a variable vector u(/) to obtain a particular solution 


y<P> « Y0)U(/). 


Substitution into (3) y r = Ay + g gives 

(8) Y'u + Yu' = AYu + g. 

Now since y (1) and y (2) are solutions of the homogeneous system, we have 

y a)/ = Ay (1) . y (2 >' = Ay (2) . thus Y' = AY. 

Hence Y'u = AYu, so that (8) reduces to 


Yu' = g. The solution is u' = Y x g; 

here we use that the inverse Y _1 of Y (Sec. 4.0) exists because the determinant of Y is the Wronskian W, which 
is not zero for a basis. Equation (9) in Sec. 4.0 gives the form of Y”\ 


1 


-e -4£ " 

1 

> 

e 2t ' 

II 

1 

NJ 

1 

9 


,- 2t . 

” 2 

_e 4t 

-e 4t . 


We multiply this by g, obtaining 


u' = Y _1 g = - 


* 2 ‘i r- 6 ^ 2t i = i r - 4 1 = r - 2 1 

- 4t -e 4t \ L 2e~ 2t i 2 L-8e 2t J L~4e 2t J ' 


Integration is done componentwise (just as differentiation) and gives 


u(/) = / \ 2 "i dt = r 2 

J 0 L-4c 2t J L— 2e 2t + 2J 


(where + 2 comes from the lower limit of integration). From this and Y in (7) we obtain 

-2t , 0 -4t“ 


Yu = 


” 2t <?“ 4t l r -2/ 1 f -2 te~ 2t ~ 2e 2t + 2e 4t "| f-2 / - 21 f 21 

~ 2t _-le 2t + 2. .-2(e~ 2t + 2e~ 2t - 2e~ 4t _ L-2r + 2J L-2J 


The last term on the right is a solution of the homogeneous system. Hence we can absorb it into y (tl \ We thus 
obtain as a general solution of the system (3), in agreement with (5*), 


y = ci 


"r 

-i- 


—2t , „ 

e + c 2 



- 2 



+ 



e 


—2t 


(9) 
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1. (General solution) Prove that (2) includes every 
solution of (1). 


2-9 


GENERAL SOLUTION 

Find a general solution. (Show the details of your work.) 
2. y'\ = ,V 2 + 1 3. y[ = 4_y 2 + 9/ 

y'i. = >’i - 3r )’2 = -4v x + 5 


4. y'\ ~ Vi + y 2 + 5 cos / 5. = 2j'! + 2y 2 + 12 

y'z = 3yj - y 2 - 5 sin 1 y 2 = 5yi - y 2 - 30 
6 . y\. = -yi + y 2 + e~ 2t 
yk = -)’i “ .V2 “ e~ 2t 


14. yi = 3y, — 4y 2 + 20 cos t 
y'z=y\ ~ 2y z 

>’i(0) = 0, y 2 (0) = 8 

15. yi = 4y 2 + 3e 3t 
y'z = 2y 2 - I5e _3t 

yi(0) = 2, y 2 (0) = 2 


16. yi 
/ 

Ji(0) 


= 4 Vi -1 - 8.v 2 + 2 cos / — 1 6 sin / 
= 6y ± 4- 2y 2 + cos t — 1 4 sin / 

= 15, y 2 (0) = 13 


7. yi = -14y, + 10y 2 + 162 
y 2 = -5y t + y 2 - 324r 

8. yi = lOyj — 6y 2 + 10(1 — t — i 2 ) 
y 2 = 6)’! — 10y 2 + 4 — 20/ — 6/ 2 

9. yi = -3)’! - 4y 2 + 1 1/ + 15 

)’2 = 5y, + 6y 2 + 3e~ l - 15r - 20 


10. CAS EXPERIMENT. Undetermined Coefficients. 
Find out experimentally how general you must choose 
y (p) , in particular when the components of g have a 
different form (e.g., as in Prob. 9). Write a short report, 
covering also the situation in the case of the 
modification rule. 


1 1-16 


INITIAL VALUE PROBLEM 


Solve (showing details): 

11. y\ = — 2y 2 + 4/ 

y L = — 2 1 

yi{0) = 4, y 2 (0) = £ 


12. y[ = 4y 2 + 5e l 


y 2 = -y\ - 20<? * 


yi(0) = i » 3' 2 (0) = o 
13. y[ =yi + 2y 2 + e 2t ~ 2 1 
y 2 = - y 2 + 1 + / 

>'i(0) = 1, y 2 ( 0) = -4 


17. (Network) Find the currents in Fig. 97 when R = 2.5 ft, 
L = 1 H, C = 0.04 F, E(t) = 845 sin t V, and 4(0) = 0, 
4(0) = 0* (Show the details.) 

18. (Network) Find the currents in Fig. 97 when R = 1 ft, 
L = 10 H, C = 1.25 F, £(/) = 10 kV, and 4(0) = 0, 
4(0) = 0. (Show the details.) 



Fig. 97. Network in Probs. 17, 18 


19. (Network) Find the currents in Fig. 98 when /4 = 2 ft, 
R 2 = 8 ft, L = 1 H, C = 0.5 F, E = 200 V. (Show the 
details.) 


L 



Fig. 98. Network in Prob. 19 

20. WRITING PROJECT. Undetermined Coefficients. 
Write a short report in which you compare the 
application of the method of undetermined coefficients 
to a single ODE and to a system of two ODEs, using 
ODEs and systems of your choice. 
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T UE 5 T I O N S AND PROBLEMS 


1. State some applications that can be modeled by systems 
of ODEs. 

2. What is population dynamics? Give examples. 

3. How can you transform an ODE into a system of ODEs? 

4. What are qualitative methods for systems? Why are they 
important? 

5. What is the phase plane? The phase plane method? The 
phase portrait of a system of ODEs? 

6. What is a critical point of a system of ODEs? How did 
we classify these points? 

7. What are eigenvalues? What role did they play in this 
chapter? 

8. What does stability mean in general? In connection with 
critical points? 

9. What does linearization of a system mean? Give an 
example. 

10. What is a limit cycle? When may it occur in mechanics? 


1 1 1-191 GENERAL SOLUTION. CRITICAL POINTS 

Find a general solution. Determine the kind and stability of 
the critical point. (Show the details of your work.) 

n - 12 - y'i = 9 >i 


>’2 = 16 ), 

13. y[ = v 2 

y'z = 6), - 5y 2 

15. y[ = 1 ,5.v ( - 6y 2 
y'n = -4.5)-, + 3.y 2 

17. y[ = 3y, + 2 y 2 
>’2 = 2), + 3 .v 2 


.vi = ) 2 

14. y' t = 3y, - 3y 2 
v 2 = 3y, + 3y 2 

16. yi = —3), - 2y 2 

y 2 = -2yi - 3) 2 

18. )J = 3), + 5) 2 
y'i = - 5)1 - 3)2 


19. )', = -), + 2)2 

V 2 = -2), - )2 


23. y\ = 4), + 3)2 + 2 

)a = -6), - 5)2 + 4<?~* 

24. )', = ), - 2)2 - sin / 
y 2 = 3)1 - 4)2 - cos t 

25. )!=), + 2)2 + t 2 
y 2 = 2), + ) 2 - / 2 

26. (Mixing problem) Tank T x in Fig. 99 contains initially 
200 gal of water in which 160 lb of salt are dissolved. 
Tank T 2 contains initially 100 gal of pure water. Liquid 
is pumped through the system as indicated, and the 
mixtures are kept uniform by stirring. Find the amounts 
of salt yi(t) and y 2 (0 in T x and T 2 , respectively. 



Fig. 99. Tanks in Problem 26 


27. (Critical point) What kind of critical point does y' = Ay 
have if A has the eigenvalues —6 and 1? 

28. (Network) Find the currents in Fig. 100, where 
/?! = 0.5 a R 2 = 0.7 a L y = 0.4 H, L* = 0.5 H, 
E = 1 kV = 1000 V, and /j(0) = 0, / 2 (0) = 0. 



*2 

Fig. 100. Network in Problem 28 


NONHOMOGENEOUS SYSTEMS 

Find a general solution. (Show the details.) 

20. yi = 3y 2 + 6 1 21. yi = y, + 2y 2 + e 2t 

v 2 = 12.VJ + 1 y 2 = -v 2 + 1.5e“ 2t 

22. yi = y x + y 2 + sin t 

y f 2 = 4yi + y 2 


29. (Network) Find the currents in Fig. 101 when R = 10 a 
L = 1 .25 H, C = 0.002 F, and / x (0) = / 2 ( 0) = 3 A. 



Fig. 101. Network in Problem 29 
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30-33 


LINEARIZATION 


Determine the location and kind of all critical points of the 
given nonlinear system by linearization. 


30. y[ = y 2 

J’2 = 4j’i - Ji 3 


31. y[ = — 9y 2 


y 2 = sin yj 


32. y[ = cos>* 2 

y 2 - 3y, 


33. yj = y 2 - 2y 2 2 

3-2 = 3’i - 2j’i 2 


SUMMARY OECHARTEHE4 “ 

Systems of ODEs. Phase Plane. Qualitative Methods 


Whereas single electric circuits or single mass-spring systems are modeled by single 
ODEs (Chap. 2), networks of several circuits, systems of several masses and springs, 
and other engineering problems lead to systems of ODEs, involving several unknown 
functions yi(t\ • • • , y n (t). Of central interest are first-order systems (Sec. 4.2): 

}’i = fi(U .Vi, ’ * • , >’ n ) 

y' = f (r, y), in components, ; 

yh = fniu y 1? • • • , y n ), 

to which higher order ODEs and systems of ODEs can be reduced (Sec. 4.1). In 
this summary we let n = 2, so that 


, = fi(t , y v y 2 ) 

(1) y = f(f, y), in components, 

yk = v 2 ) 

Then we can represent solution curves as trajectories in the phase plane (the 
.Vi 3 ? 2 ’Pl ane )» investigate their totality [the “phase portrait” of (1)], and study the 
kind and stability of the critical points (points at which both and f 2 are zero), 
and classify them as nodes, saddle points, centers, or spiral points (Secs. 4.3, 4.4). 
These phase plane methods are qualitative; with their use we can discover various 
general properties of solutions without actually solving the system. They are 
primarily used for autonomous systems, that is, systems in which t does not occur 
explicitly. 

A linear system is of the form 


(2) y' = Ay -H g, 


r a u 

a 12~ 


"Vi” 

r*r 

where A = 

> 

y = 


> g = 

- a 21 

a 22- 


J’2. 

L&2_ 


If g = 0, the system is called homogeneous and is of the form 


(3) 


y' = Ay. 






Summary of Chapter 4 


165 


If a n , * • * , a 22 are constants, it has solutions y = xe xt , where A is a solution of the 
quadratic equation 



a l2 

a 22 ~~ ^ 


— (^11 ^)(#22 A ) ^ 12 ^ 2 i — 0 


and x =£ 0 has components jc lt x 2 determined up to a multiplicative constant by 


(a u - X)xi + a 12 x 2 = 0. 


(These A’s are called the eigenvalues and these vectors x eigenvectors of the matrix 
A. Further explanation is given in Sec. 4.0.) 

A system (2) with g ¥= 0 is called nonhomogeneous. Its general solution is of 
the form y = y h 4- y p , where y h is a general solution of (3) and y p a particular 
solution of (2). Methods of determining the latter are discussed in Sec. 4.6. 

The discussion of critical points of linear systems based on eigenvalues is 
summarized in Tables 4.1 and 4.2 in Sec. 4.4. It also applies to nonlinear systems 
if the latter are first linearized. The key theorem for this is Theorem 1 in Sec. 4.5, 
which also includes three famous applications, namely the pendulum and van der 
Pol equations and the Lotka-Volterra predator-prey population model. 





CHAPTER 5 

Series Solutions of ODEs. 
Special Functions 


In Chaps. 2 and 3 we have seen that linear ODEs with constant coefficients can be solved 
by functions known from calculus. However, if a linear ODE has variable coefficients 
(functions of x\ it must usually be solved by other methods, as we shall see in this 
chapter. 

Legendre polynomials, Bessel functions, and eigenfunction expansions are the three 
main topics in this chapter. These are of greatest importance to the applied mathematician. 

Legendre’s ODE and Legendre polynomials (Sec. 5.3) are likely to occur in problems 
showing spherical symmetry . They are obtained by the power series method (Secs. 5. 1 , 
5.2), which gives solutions of ODEs in power series. 

Bessel’s ODE and Bessel functions (Secs. 5.5, 5.6) are likely to occur in problems 
showing cylindrical symmetry. They are obtained by the Frobenius method (Sec. 5.4), 
an extension of the power series method which gives solutions of ODEs in power series, 
possibly multiplied by a logarithmic term or by a fractional power. 

Eigenfunction expansions (Sec. 5.8) are infinite series obtained by the Sturm- 
Liouville theory (Sec. 5.7). The terms of these series may be Legendre polynomials or 
other functions, and their coefficients are obtained by the orthogonality of those functions. 
These expansions include Fourier series in terms of cosine and sine, which are so 
important that we shall devote a whole chapter (Chap. 1 1) to them. 

Special functions (also called higher functions) is a name for more advanced functions 
not considered in calculus. If a function occurs in many applications, it gets a name, and 
its properties and values are investigated in all details, resulting in hundreds of formulas 
which together with the underlying theory often fill whole books. This is what has 
happened to the gamma, Legendre, Bessel, and several other functions (take a look into 
Refs. [GR1], [GR10], [A1 1] in App. 1). 

Your CAS knows most of the special functions and corresponding formulas that you 
will ever need in your later work in industry, and this chapter will give you a feel for the 
basics of their theory and their application in modeling. 

COMMENT. You can study this chapter directly after Chap. 2 because it needs no 
material from Chaps. 3 or 4. 

Prerequisite: Chap. 2. 

Sections that may be omitted in a shorter course: 5.2, 5.6-5. 8. 

References and Answers to Problems: App. 1 Part A, and App. 2. 
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SEC 5.1 Power Series Method 

5.1 Power Series Method 

The power series method is the standard method for solving linear ODEs with variable 
coefficients. It gives solutions in the form of power series. These series can be used for 
computing values, graphing curves, proving formulas, and exploring properties of solutions, 
as we shall see. In this section we begin by explaining the idea of the power series method. 

Power Series 

From calculus we recall that a power series (in powers of x — *o) is an infinite series of 
the form 


( 1 ) 2 a m( x ~ x o) m = a 0 + ax(x - A' 0 ) + a 2 (x - x 0 f + • • • . 

m= 0 

Here, jc is a variable. a 0f a l9 a 2i • • * are constants, called the coefficients of the series. 
x 0 is a constant, called the center of the series. In particular, if x 0 = 0, we obtain a power 
series in powers of x 


(2) X am*™ = a 0 + a l x + a 2fi 2 + + ’ • * • 

m= 0 


We shall assume that all variables and constants are real. 
Familiar examples of power series are the Maclaurin series 


1 


1 — x 


= 2 X m = l +x + x 2 + 


(1*1 < 1, geometric series) 


m=0 


<» Y m r 2 „3 

i A A A 

e =2 0 ^ = 1+ * + 3T + 3t + 


* 2 * 4 


^ (-ir* 2m . .. . .. 

cos ^ = X = 1 — — + — — + 

to ( 2 «) ! 2! 4! 


00 

sin x = X 

m 


(-l ) m * 2m+1 * 3 JC 5 

(2m + 1)! “ A 3! + 5! + 


We note that the term ‘‘power series” usually refers to a series of the form (1) [or (2)] 
but does not include series of negative or fractional powers of x. We use m as the 
summation letter, reserving n as a standard notation in the Legendre and Bessel equations 
for integer values of the parameter. 

Idea of the Power Series Method 

The idea of the power series method for solving ODEs is simple and natural. We describe 
the practical procedure and illustrate it for two ODEs whose solution we know, so that 
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EXAMPLE 1 


we can see what is going on. The mathematical justification of the method follows in the 
next section. 

For a given ODE 

/ + P(x)y' + q(x)y = 0 

we first represent p(x) and q(x) by power series in powers of x (or of x — a 0 if solutions 
in powers of x — a 0 are wanted). Often p(x) and q(x) are polynomials, and then nothing 
needs to be done in this first step. Next we assume a solution in the form of a power series 
with unknown coefficients, 

OC 

(3) y = 2 a m x m = flo + + ap 2 + a z x z + • • • 

w= 0 

and insert this series and the series obtained by termwise differentiation, 

oo 

y = X mcl m xm ~ X = o 1 + 2a 2 x 4- 3 a z x 2 4- • • • 

tn=l 
00 

y" = 2 m O n ~ 1 )fl w x m “ 2 = 2 a 2 4- 3 • 2 a 3 x 4- 4 • 3 a 4 x 2 4- • • • 

m= 2 


(a) 
(4) 

(b) 


into the ODE. Then we collect like powers of a- and equate the sum of the coefficients of 
each occurring power of x to zero, starting with the constant terms, then taking the terms 
containing a, then the terms in a 2 , and so on. This gives equations from which we can 
determine the unknown coefficients of (3) successively. 

Let us show this for two simple ODEs that can also be solved by elementary methods, 
so that we would not need power series. 

Solve the following ODE by power series. To grasp the idea, do this by hand; do not use your CAS (for 
which you could program the whole process). 

y = 2vv. 

Solution . We insert (3) and (4a) into the given ODE, obtaining 


a i 4 lci 2 x 4- 3rt 3 A 2 4 • • • = 2v(f/Q 4 a* 4 4 • • •). 


We must perform the multiplication by 2v on the right and can write the resulting equation conveniently as 

4 2a 2 x 4 3a 3 x 2 4 4a 4 x 3 4 Sa&x 4 4 6 a&x 5 4 • ♦ • 

= 2a 0 x 4 lap? 4 2^2-V 3 + lap? 4 2^v 5 4 • * • . 


For this equation to hold, the two coefficients of every power of x on both sides must be equal, that is, 
a\ — 0 , la 2 = 2 fl 0 . 3r?3 = 2a ^ 4 — 2 a 2 , 5a 5 = 2 a& 6 a$ = 2 • • ■ . 

Hence a 3 = 0, = 0, - • • and for the coefficients with even subscripts. 


_ a 2 a 0 gq 

^ a *~ ~ = IT • ° 6= T = 3T ’ " ’ ; 
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EXAMPLE 2 


a 0 remains arbitrary. With these coefficients the series (3) gives the following solution, which you should confirm 
by the method of separating variables. 


/ 2 a* 4 .v 6 a 8 \ 2 

>• = «o (> + -v + 2 r + 3i" + 4r + “/ = a ° e • 

More rapidly, (3) and (4) give for the ODE y* = 2 xy 


1 -a x x° + 2 ma m x m 1 = 2-v X “m*” 1 = 2 2« m A m+1 . 

m*=2 m“0 m=0 


Now, to get die same general power on both sides, we make a “shift of index” on the left by setting m = s 4- 2, 
thus m - 1 = s + 1. Then a m becomes a s +2 and .v m-1 becomes .v* +1 . Also the summation, which started with 
m = 2. now starts with s = 0 because s = m — 2. On the right we simply make a change of notation m = $, 
hence a m = a s and A m+1 = .v s+1 ; also the summation now starts with s = 0. This altogether gives 

OC DO 

a i + 2 I* + 2V/ 5+ 2 a ' S+1 = 2 2 ^ s a s+1 . 

* = 0 4i = 0 


Every occurring power of a must have the same coefficient on both sides; hence 


2 

o x = 0 and ( s + 2 )a s+2 = 2a s or a s+2 = + 2 fig. 


For 5 = 0, 1. 2, • • • we thus have a 2 = (2/2 )« 0 , tf 3 = (2/3)a x = 0, a 4 = (2/4)« 2 , * * • as before. I 


Solve 

Solution. 


/ + y = 0. 

By inserting (3) and (4b) into the ODE we have 

OC OC 

2 '”(/» - l)n w .r m “ 2 -1- 2 a rrt A " ,n = 0. 

m=2 7n=0 


To obtain the same general power on both series, we set m = a- + 2 in the first series and m = s in the second, 
and then we take the latter to the right side. This gives 

OC OC 

2(J + 2 )(J + = - 2 

s-Q s«0 

Each power .v $ must have the same coefficient on both sides. Hence ( s + 2)(.v + l)«^^. 2 = -a s . This gives the 
recursion formula 


a S+ 2 ~ 


(s + 2 )(s + 1) 


We thus obtain successively 

*2 = 

a 4 = - 


fl o 
2* l 

4*3 


fo 

u 


fo 

4! 


^3 = “ 


«5 = - 


01 

3-2 

fl 3 

5-4 


3! 


5! 


and so on. a 0 and a x remain arbitrary. With these coefficients the series (3) becomes 

a o o a i •* a o a. a i *i 
>• = a 0 + «,.v - 2 j- a- “ 3T a 3 + a 4 + — .r 5 + • • • 


(A = o, I, • • •). 
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Reordering terms (which is permissible for a power series), we can write this in the form 



and we recognize the familiar general solution 


y = a 0 cos x + a 1 sin x. 


Do we need the power series method for these or similar ODEs? Of course not; we used 
them just for explaining the idea of the method. What happens if we apply the method 
to an ODE not of the kind considered so far, even to an innocent-looking one such as 
y" + xy = 0 (“Airy’s equation”)? We most likely end up with new special functions given 
by power series. And if such an ODE and its solutions are of practical (or theoretical) 
interest, we name and investigate them in terms of formulas and graphs and by numeric 
methods. 

We shall discuss Legendre’s, Bessel’s, and the hypergeometric equations and their 
solutions, to mention just the most prominent of these ODEs. To do this with a good 
understanding, also in the light of your CAS, we first explain the power series method 
(and later an extension, the Frobenius method) in more detail. 




[wo] POWER SERIES METHOD: TECHNIQUE, 
FEATURES 


Apply the power series method. Do this by hand, not by a 
CAS, so that you get a feel for the method, e.g., why a 
series may terminate, or has even powers only, or has no 
constant or linear terms, etc. Show the details of your work. 


1. y' - y = 0 
3. y" + Ay = 0 
5. (2 + x)y f = y 
7. y f = y + x 


2. / + xy = 0 

4. /' - y = 0 

6. / + 3(1 + a* 2 )v = 0 

8. (.v 5 + 4.v 3 )/ = (5 a- 4 + 12A- 2 )y 


9 . /' - / = 0 10. /' - a-/ + y = 0 

1 11-16 1 CAS PROBLEMS. INITIAL VALUE 
PROBLEMS 


Solve the initial value problems by a power series. Graph 
the partial sum s of the powers up to and including a 5 . Find 
the value of s (5 digits) at x v 


11. / + 4y = 1, y(0) = 1.25, a, = 0.2 

12. / = 1 + y z , .y(O) = 0, jc, = \ir 

13. / = y - )- 2 , >>(0) = i .v, = 1 

14. (jc - 2)/ = xy, y(0) = 4, x 1 = 2 

15. y" + 3xy' + 2y = 0, y(0) = 1, 

/( 0) = 1, A-! = 0.5 

16. (1 - x 2 )y" - 2xy ' + 30y = 0, >(0) = 0, 

/( 0) = 1.875, a-! = 0.5 

17. WRITING PROJECT. Power Series. Write a review 
(2-3 pages) on power series as they are discussed in 
calculus, using your own formulation and examples — 
do not just copy passages from calculus texts. 

18. LITERATURE PROJECT. Maclaurin Series. 
Collect Maclaurin series of the functions known from 
calculus and arrange them systematically in a list that 
you can use for your work. 


5.2 Theory of the Power Series Method 

In the last section we saw that the power series method gives solutions of ODEs in die 
form of power series. In this section we justify the method mathematically as follows. We 
first review relevant facts on power series from calculus. Then we list the operations on 
power series needed in the method (differentiation, addition, multiplication, etc.). Near 
the end we state the basic existence theorem for power series solutions of ODEs. 
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Basic Concepts 

Recall from calculus that a power series is an infinite series of the form 


(1) 2 a m(x - X 0 ) m = a 0 + a^x - x 0 ) + a 2 (x - x 0 f + • • • 
m-0 

As before, we assume the variable x , the center x 0t and the coefficients a 0 , a lf • • • to be 
real. The nth partial sum of (1) is 

(2) s n (x) = ci 0 + a x (x - x 0 ) + a 2 (x - x 0 ) 2 + • • • + ajjc - x 0 ) n 

where n = 0, 1, ■ • • . Clearly, if we omit the terms of s n from (1), the remaining expression 
is 

(3) R n (x) = a n+1 (x - x 0 ) n+1 + a n+2 (x - x 0 ) n+z + • • • . 

This expression is called the remainder of( 1) after the term a n (x — x 0 ) n . 

For example, in the case of the geometric series 

1 + x + x 2 + • • • + + • • • 

we have 

s 0 = 1, R 0 = x + x 2 + x 3 + • • ■ , 

s x = 1 + x, Ri — x 2 + x 3 + x 4 + • • • , 

s 2 — 1 + x + x 2 , R 2 = x 3 + x 4 + x s +••• , etc. 

In this way we have now associated with (1) the sequence of the partial sums 
s 0 (x), $ 2 (*)> • • • . If for some x = x x this sequence converges, say, 

lim s n (x{) = sC*!), 

n -+ co 


then the series (1) is called convergent at x = x lt the number s(x{) is called the value or 
sum of (I) at jtj, and we write 


0 © 

s(*l) = 2 «m(*l - X 0 ) m . 
m= 0 


Then we have for every n, 

(4) s(xx) = s n (x a ) + 

If that sequence diverges at x = x lt the series (1) is called divergent at x = Xi. 

In the case of convergence, for any positive e there is an N (depending on e) such that, 
by (4), 


(5) 


\R n (Xl)\ = - sjx i)| < € 


for all n > N. 
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Geometrically, this means that all s^Xx) with n > N lie between ^(a^) — e and + e 
(Fig. 102). Practically, this means that in the case of convergence we can approximate 
the sum s(ax) of (1) at a* by s n (xx) as accurately as we please, by taking n large enough. 

Convergence Interval. Radius of Convergence 

With respect to the convergence of the power series (1) there are three cases, the useless 
Case 1, the usual Case 2, and the best Case 3, as follows. 

Case 1 . The series ( 1 ) always converges at a* = a* 0 , because for x = a 0 all its terms are 
zero, perhaps except for the first one, a Q . In exceptional cases x = x 0 may be the only x 
for which (1) converges. Such a series is of no practical interest. 

Case 2. If there are further values of a* for which the series converges, these values form 
an interval, called the convergence interval. If this interval is finite, it has the midpoint 
x 0 , so that it is of the form 


(6) |a* - x 0 | < R (Fig. 103) 

and the series (1) converges for all x such that |x — x 0 | < R and diverges for all a* such 
that |a* - A‘ 0 | > R . (No general statement about convergence or divergence can be made 
for a* - x 0 = R or —R.) The number R is called the radius of convergence of (1). (R is 
called “radius” because for a complex power series it is the radius of a disk of convergence.) 
R can be obtained from either of the formulas 

(7) (a) R= l/lim V^J (b) R = l/lim 

/nwoc a m 

provided these limits exist and are not zero. [If these limits are infinite, then (1) converges 
only at the center x 0 .] 

Case 3. The convergence interval may sometimes be infinite, that is, (1) converges for 
all a*. For instance, if the limit in (7a) or (7b) is zero, this case occurs. One then writes 
R = so, for convenience. (Proofs of all these facts can be found in Sec. 15.2.) 

For each a* for which ( 1 ) converges, it has a certain value s(x). We say that ( 1 ) represents 
the function s(x) in the convergence interval and write 

CO 

*(•*) = 2 a m {x - x 0 ) m (|„v - * 0 | < R). 

m — 0 


Let us illustrate these three possible cases with typical examples. 



s(.Vj) - e stej) s(Xj ) + e 

Fig. 102. Inequality (5) 


Divergence 

Conve 

rgence 

Divergence 



• /i - 



•*o "" ^ * v o X Q + ^ 

Fig. 103. Convergence interval (6) of a power 
series with center x 0 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


EXAMPLE 4 


The Useless Case 1 of Convergence Only at the Center 

In the case of die series 

00 

2 m\x m = I + x + 2.v 2 + 6* 3 + • • • 
m=0 


we have a m = ml and in (7b), 


1 


{m + 1)! 
w! 


= m + I 


00 


Thus this series converges only at the center x = 0. Such a series is useless. 


as m — > oc. 


The Usual Case 2 of Convergence in a Finite Interval. Geometric Series 

For the geometric series we have 

1 00 

— — = X * TO = i + a- + x 2 + • • • (M < i). 

m=0 

In fact, a m = 1 for all m, and from (7) we obtain R = 1 , that is, the geometric series converges and represents 
1/(1 — x ) when | a *| < 1 . H 


The Best Case 3 of Convergence for All x 


In the case of the series 



we have a m = 1/m!. Renee in (7b), 


I *1* x + 


2! 


q m+l = 1/Q« + 1)1 = 1 

1/m! m -1- 1 


as m 


00, 


so that the series converges for all a*. 


Hint for Some of the Problems 

Find the radius of convergence of the series 

00 / n m 

V 1 1 j v 3 m . 

^ o m Y 


.v 3 .v 6 .v 9 

l_ T + 6T _ 5l2 +_ 


Solution . This is a series in powers of / = a 3 with coefficients a m = (- I) w /8”\ so that in (7b), 


1 


qW+1 


Thus R = 8. Hence the series converges for |/| = |a* 3 | < 8, that is, |a| < 2. 


Operations on Power Series 

In the power series method we differentiate, add, and multiply power series. These three 
operations are permissible, in the sense explained in what follows. We also list a condition 
about the vanishing of all coefficients of a power series, which is a basic tool of the power 
series method. (Proofs can be found in Sec. 15.3.) 
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Termwise Differentiation 

A power series may be differentiated term by term. More precisely: if 


00 

y(x) = 2 a m(x - X 0 r 

m=0 


converges for \x - x 0 \ < /?, where R > 0, then the series obtained by differentiating term 
by term also converges for those x and represents the derivative y f of y for those x, 
that is, 

oc 

/(*) = 2 >na m {x ~ (|* - x 0 \ < R). 

m*=l 

Similarly, 

oc 

y"(x) = 2 1,1 ( m ~ Oa TO C* - x 0 ) m ~ 2 (I* - x 0 \ < R), etc. 

m=2 


Termwise Addition 

Two power series may be added term by term. More precisely: if the series 
(8) 2 a m(x ~ x 0 ) m and 2 M* ~ x 0 ) m 

m=0 m=0 

have positive radii of convergence and their sums are f(x) and g(x), then the series 


oc 

2 («w + b m ){x - x 0 ) m 

m=0 

converges and represents f(x) + g(x) for each x that lies in the interior of the convergence 
interval of each of the two given series. 

Termwise Multiplication 

Two power series may be multiplied term by term. More precisely: Suppose that the series 
(8) have positive radii of convergence and let /( x) and g(x) be their sums. Then the 
series obtained by multiplying each term of the first series by each term of the second 
series and collecting like powers of x - ,v 0 > that is, 

oc 

ifiobm "I” ai&m-l &vibo)(x Xq) 

m = 0 

= a 0 bo -I- ( a 0 b x + a x b 0 )(a* - x 0 ) + (a 0 b 2 + a x b x + a 2 b 0 )(x - x 0 ) 2 + • • • 

converges and represents f(x)g(x) for each x in the interior of the convergence interval of 
each of the two given series. 
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DEFINITION 


THEOREM 1 


Vanishing of All Coefficients 

If a power series has a positive radius of convergence and a sum that is identically zero 
throughout its interval of convergence, then each coefficient of the series must be zero. 

Existence of Power Series Solutions of ODEs. 

Real Analytic Functions 

The properties of power series just discussed form the foundation of the power series 
method. The remaining question is whether an ODE has power series solutions at all. An 
answer is simple: If the coefficients p and q and the function r on the right side of 

(9) v" + p(x)y' + q(x)y = r(x) 

have power series representations, then (9) has power series solutions. The same is true 
if It, p , q , and ? in 

(10) h(x)y" + p(x)y' + q{x)y = r(x) 

have power series representations and h(x 0 ) =£ 0 (x 0 the center of the series). Almost all 
ODEs in practice have polynomials as coefficients (thus terminating power series), so that 
(when r(x) = 0 or is a power series, too) those conditions are satisfied, except perhaps 
the condition K(x 0 ) ^ 0. If K(x 0 ) # 0, division of (10) by K(x) gives (9) with p = pfh , 
q = q/Ti , r = Tilt. This motivates our notation in (10). 

To formulate all this in a precise and simple way, we use the following concept (which 
is of general interest). 


Real Analytic Function 

A real function f(x) is called analytic at a point x = jc 0 if it can be represented by 
a power series in powers of x — x 0 with radius of convergence R > 0. 


Using this concept, we can state the following basic theorem. 


Existence of Power Series Solutions 

If p, q, and r in (9) are analytic at x = a 0 , then every solution of (9) is analytic at 
x = A' 0 and can thus be represented by a power series in powers of x — x 0 with 
radius of convergence R > 0. Hence the same is true if h, p, q, and r in ( 1 0) are 
analytic at x = a* 0 and lt(x 0 ) # 0. 


The proof of this theorem requires advanced methods of complex analysis and can be 
found in Ref. [All] listed in App. 1. 

We mention that the radius of convergence R in Theorem 1 is at least equal to the 
distance from the point a* = x 0 to the point (or points) closest to a 0 at which one of the 
functions p , q , r, as functions of a complex variable , is not analytic. (Note that that point 
may not lie on the x-axis but somewhere in the complex plane.) 
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PROBLEM SET 5.2 


M2] RADIUS OF CONVERGENCE 

Determine the radius of convergence. (Show the details.) 


1- 2 75T (c * 0) 

A ^ 


2 ‘ ^ 3"'(m + If ( * + ,)2 ” 


3. f <2L±J>2 ( , - 

>17 = 1 L 


4. 2 ( - 1 )”‘.r 4m 


5 f v , 

(2m + 2)(2m + 4) ’ 


CO /IXttt 

, ^ 1 IL 2tn-f 10 

~ ( m !) 2 V 

m=0 v 7 
oc / iyn 

7. 2 - l ) 2m 

m«2 


8 y 

*• ~ (,«!) 4 

W= 1 v 


(m + 3) 2 
Cm - 3) 4 


io. 2 


11. 2 -sr (.v - i*)" 

W = 1 

’y ( m 2m+l 

(2/» + 1)! 

13-15 1 SHIFTING SUMMATION INDICES 
(CF. SEC. 5.1) 

This is often convenient or necessary in the power series 
method. Shift the index so that the power under the 
summation sign is x s . Check by writing the first few terms 
explicitly. Also determine the radius of convergence R . 

13. 2 — -v" +2 


oc / i vm+1 

i4 y - — - — 

Am 

m~3 


15 V - V P + 4 

<*>+■» 

16-23[ POWER SERIES SOLUTIONS 

Find a power series solution in powers of x. (Show the 
details of your work.) 

16. y" + xy = 0 

17. y” - y' + x 2 y = 0 

18. y" — y' + xy = 0 

19. y" + 4 xy' = 0 

20. y" + 2 xy' + y = 0 

21. y" + (I + ,v 2 )y = 0 

22. y" - 4 xy' + (4x 2 - 2)y = 0 

23. (2x 2 - 3.v + l)y" + 2xy' - 2y = 0 

24. TEAM PROJECT. Properties from Power Series. 
In the next sections we shall define new functions 
(Legendre functions, etc.) by power series, deriving 
properties of the functions directly from the series. To 
understand this idea, do the same for functions familiar 
from calculus, using Maclaurin series. 

(a) Show that cosh* + sinhx = e x . Show that 
cosh* > 0 for all x. Show that e x ^ e~ x for all 
* ^ 0 . 

(b) Derive the differentiation formulas for e x , cos x, 
sin*, 1/(1 — x) and other functions of your choice. 
Show that (cos*) ,/ = —cos*, (cosh*)” = cosh*. 
Consider integration similarly. 

(c) What can you conclude if a series contains only 
odd powers? Only even powers? No constant term? If 
all its coefficients are positive? Give examples. 

(d) What properties of cos * and sin * are not obvious 
from the Maclaurin series? What properties of other 
functions? 

25. CAS EXPERIMENT. Information from Graphs of 
Partial Sums. In connection with power series in 
numerics we use partial sums. To get a feel for the 
accuracy for various x, experiment with sin* and 
graphs of partial sums of the Maclaurin series of an 
increasing number of terms, describing qualitatively 
the “breakaway points” of these graphs from the 
graph of sin x. Consider other examples of your own 
choice. 
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5.3 Legendre's Equation. 

Legendre Polynomials P n (x) 

In order to first gain skill, we have applied the power series method to ODEs that can 
also be solved by other methods. We now turn to the first “big” equation of physics, for 
which we do need the power series method. This is Legendre’s equation 1 

(1) (1 - x 2 )y " - 2 xy' + n(n + l)y = 0 


where n is a given constant. Legendre’s equation arises in numerous problems, particularly 
in boundary value problems for spheres (take a quick look at Example I in Sec. 12.10). 
The parameter n in (1) is a given real number. Any solution of (1) is called a Legendre 
function. The study of these and other “higher” functions not occurring in calculus is 
called the theory of special functions. Further special functions will occur in the next 
sections. 

Dividing (1) by the coefficient 1 — jc 2 of y\ we see that the coefficients — 2 jc/(1 — x 2 ) 
and n(n 4 - l)/(l — x 2 ) of the new equation are analytic at x = 0. Hence by Theorem 1 , 
in Sec. 5.2, Legendre’s equation has power series solutions of the form 

00 

(2) y = S 

m=0 

Substituting (2) and its derivatives into (1), and denoting the constant n(n + 1) simply by 
k , we obtain 


CO CO GO 

(1 “ X 2 ) 2 ~ l)tf, n* m ~ 2 ” 2x 2 ^a m X m ~ 1 + k 2 a m x7TL ~ 0. 

m = 2 m= 1 m — 0 

By writing the first expression as two separate series we have the equation 

00 00 oo oo 

2 m ( m — 1 )a m x m ~ 2 “2 m ( m ~ “ 2 2ma m x m + 2 k a m x m = 0. 

m= 2 m=2 m=l m = 0 

To obtain the same general power x s in all four series, we set m — 2 = s (thus m = s + 2) 
in the first series and simply write s instead of m in the other three series. This gives 

OO 00 oo oo 

2 (s + 2)(5 + l)a s+2 * s — 2 s ( s ~ 1 )a s xS ~ 2 2 sa s x s + 2 = 0. 

s=>0 2 s=l s=0 


1 ADRIEN-MARIE LEGENDRE (1752-1833), French mathematician, who became a professor in Paris in 
1775 and made important contributions to special functions, elliptic integrals, number theory, and the calculus 
of variations. His book Elements de geomdtne (1794) became very famous and had 12 editions in less than 30 
years. 

Formulas on Legendre functions may be found in Refs. [GR1J and [GRIO]. 
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(Note that in the first series the summation begins with ^ = 0.) Since this equation with 
right side 0 must be an identity in x if (2) is to be a solution of (1), the sum of the 
coefficients of each power of a* on the left must be zero. Now a 0 occurs in the first and 
fourth series and gives [remember that k = n(n + 1)] 

(3a) 2 • la 2 4- n(n 4 - l)tf 0 = 0. 

a 1 occurs in the first, third, and fourth series and gives 

(3b) 3 • 2a 3 +[— 2 4- n(n -1- l)]r/j = 0. 

The higher powers a 2 , a 3 , • • • occur in all four series and give 

(3c) ( s 4- 2 )(s 4- l)a s + 2 4* [—$($ “ 1) — 2s 4- n(n 4- l)]a s = 0. 

The expression in the brackets [• • •] can be written (;? — s)(n 4 j 4 1), as you may 
readily verify. Solving (3a) for ci 2 and (3b) for a 3 as well as (3c) for a s + 2 , we obtain the 
general formula 


( 4 ) 


( n — s)(n 4- .v 4- 1) 

(s + 2)(s + 1) ° s 


(s = 0, 1, • • •)- 


This is called a recurrence relation or recursion formula. (Its derivation you may verify 
with your CAS.) It gives each coefficient in terms of the second one preceding it, except 
for a 0 and which are left as arbitrary constants. We find successively 


a 2 — 


n(n +1) 
2\~ 


a o 


{n - 2){n 4- 3) 

« 4 - 4 . 3 ® 2 

in - 2)n(n + 1)(h + 3) 
4! 


a 0 


a 3 = “ 


(« - l)(/i + 2) 

IT 

(n — 3 )(« + 4) 


a i 


fl5_ 5-4 

(n - 3)(n - l)in + 2)in + 4) 
5! 




and so on. By inserting these expressions for the coefficients into (2) we obtain 

(5) .vM = a 0 )'i« + «iy 2 W 
where 

n(n +1) in - 2)n(n + 1)(« + 3) 

(6) yi (x) = l x 2 + -v 4 - + • • • 

in ~ D(n + 2) in - 3 )(« - l)(/i + 2 )(/j + 4) 

(7) y 2 (x) = x x 3 + x 5 - + • • • . 


These series converge for |x| < 1 (see Prob. 4; or they may terminate, see below). Since 
(6) contains even powers of x only, while (7) contains odd powers of x only, the ratio 
yj /^2 is not a constant, so that y x and y z are not proportional and are thus linearly 
independent solutions. Hence (5) is a general solution of (1) on the interval -1 < x < 1. 
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Legendre Polynomials P n (x) 

In various applications, power series solutions of ODEs reduce to polynomials, that is, 
they terminate after finitely many terms. This is a great advantage and is quite common 
for special functions, leading to various important families of polynomials (see Refs. [GR1] 
or [GR10] in App. 1). For Legendre’s equation this happens when the parameter n is a 
nonnegative integer because then the right side of (4) is zero for s = n, so that a n + 2 — 0, 
ci n + 4 = 0, a n + 6 = 0, • • • . Hence if n is even, y^.*) reduces to a polynomial of degree n. 
If n is odd, the same is true for y 2 W- These polynomials, multiplied by some constants, 
are called Legendre polynomials and are denoted by P n (x). The standard choice of a 
constant is done as follows. We choose the coefficient a n of the highest power x n as 

(2/z) ! 1 -3-5 • • • (2n - 1) 

(8) a n = ^ = — (« a positive integer) 

(and a n = 1 if n — 0). Then we calculate the other coefficients from (4), solved for a $ in 
terms of a s + 2 , that is, 


(9) 




(5 + 2)(J + 1) 

(n — s)(n + s + 1) 


«S+2 


(tSn- 2). 


The choice (8) makes P n (l) = 1 for every n (see Fig. 104 on p. 180); this motivates (8). 
From (9) with s = n — 2 and (8) we obtain 


a n-2 ~ 


«(» ~ 1 ) 

2 ( 2 n - 1 ) 


a„ = - 


n(n - l)(2n)! 

2(2 n - l)2 n (n!) 2 ' 


Using (2n)l = 2n(2n — 1)(2 n — 2)!, n\ = n(n — 1)1, and h! = n(n — 1 )(n — 2)!, we 
obtain 


n(n — l)2n(2n — l)(2w — 2)1 
2(2 n ~ l)2 n /i(n - 1)1 n(n - 1 )(n - 2)1 * 


n(n — \)2n(2n — 1) cancels, so that we get 


- (2” ~ 2)1 
a ”“ 2 “ 2 : tt (n - 1)! (n - 2)! ' 

Similarly, 

(n - 2)(n - 3) 

fln " 4 4(2 n ~ 3) ° n - 2 

(2 n - 4)! 

“ 2^2! (» - 2)! (/i - 4)1 
and so on, and in general, when n — 2m ^ 0, 


^n-2m = (-ir 


(2 n — 2m ) ! 

2 n m! (/? — m)\ (n — 2 m)l 


GO) 
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The resulting solution of Legendre’s differential equation (1) is called the Legendre 
polynomial of degree n and is denoted by P n (x). 

From (10) we obtain 


( 11 ) 


M 


Pn(x) = 2 (“I)" 

m= 0 


(2 n — 2m)! 


2 n m! (n - m)! (n - 2m)! 


v n-2m 


= (2/i)! (2 « - 2)! 

2 n (/i!) 1 2 * 2”1! (n - 1)! (n - 2)! 


where M = n/2 or (n — l)/2, whichever is an integer. The first few of these functions are 
(Fig. 104) 

Port = 1, Pi(x) = x 

(11') Pz(x) = hOx 2 - 1), P z (x) = |(5x 3 - 3x) 

F 4 (x) = £(35x 4 - 30x 2 + 3), P 5 (x) = J(63x 5 - 70x 3 + 15x) 


and so on. You may now program (11) on your CAS and calculate P n (x) as needed. 

The so-called orthogonality of the Legendre polynomials will be considered in 
Secs. 5.7 and 5.8. 



1. Verily that the polynomials in (1 1 ') satisfy Legendre’s 
equation. 

2. Derive (IT) from (11). 

3. Obtain P 6 and P 7 from (11). 

4. (Convergence) Show that for any n for which (6) or 
(7) does not reduce to a polynomial, the series has 

radius of convergence 1. 


5. (Legendre function Qq(x) for n = 0) Show that (6) 
with n = 0 gives y x (x) = P 0 (x) = 1 and (7) gives 


, 2 2 , (— 3 )(— 1 ) • 2 * 

y 2 (x) =x + — x* + 


x 5 + 



+ • • • 



1 

1 -x * 
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Verify this by solving (1) with n = 0, setting z = / 
and separating variables. 

6. (Legendre function — fii(jc) for n = 1) Show that (7) 
with n = 1 gives y 2 (.v) = P x (x) = x and (6) gives 
Vi(a) = — Q x (jr) (the minus sign in the notation being 
conventional). 


ViCv) = 




= 1 


A* I A* 





1 ~j~ A' 

1 - A * 


7. (ODE) Find a solution of 

(a 2 - x 2 )y" - 2xy' + n(n + l).y = 0, a * 0, 
by reduction to the Legendre equation. 

8. [Rodrigues’s formula (12)] 2 Applying the binomial 
theorem to (a 2 - 1)”, differentiating it n times term 
by term, and comparing the result with (11), show 
that 


( 12 ) 


^n( v ) 2 n n \ ( J x n ^ 1 * 


9. (Rodrigues’s formula) Obtain (1 1 ') from (12). 


(a) Legendre polynomials. Show that 


(13) 


G(m, x) = 


1 

Vl — 2 a m + m 2 


= 2 Pt&x)u n 

n-0 


is a generating function of the Legendre polyn omials . 
Hint: Start from the binomial expansion of 1/V 1 - u, 
then set v = 2am — m 2 , multiply the powers of 
2am — m 2 out. collect all the terms involving m w , and 
verify that the sum of these terms is P n (x)u n . 

(b) Potential theory. Let A x and A 2 be two points in 
space (Fig. 105, r 2 > 0). Using (13), show that 


1 

V/*x 2 + r 2 2 - 2f\r 2 cos 6 


1 °° 

= — 2 P m ( cos 0 ) 

m= 0 



This formula has applications in potential theory. 
( Q/r is the electrostatic potential at A 2 due to a 
charge Q located at A v And the series expresses 1/r 
in terms of the distances of A, and A 2 from any origin 
O and the angle 0 between the segments OA l and 
OA 2 .) 


CAS PROBLEMS 

10. Graph P 2 (x\ • • • . P^ 0 (x) on common axes. For what 
a (approximately) and n = 2, * • • , 10 is |P„(a)| < §? 

11. From what n on will your CAS no longer produce 
faithful graphs of P n (x)l Why ? 

12. Graph Q 0 W, Qi(x), and some further Legendre 
functions. 

13. Substitute a^x* + a s+l x s i l + a s+2 x s * 2 into Legendre’s 
equation and obtain the coefficient recursion (4). 

14. TEAM PROJECT. Generating Functions. 
Generating functions play a significant role in modern 
applied mathematics (see [GR5]). The idea is simple. 
If we want to study a certain sequence (/„(a)) and can 
find a function 

oc 

G(«, .v) = 2 /»(•<)«". 

we may obtain properties of (f n (x)) from those of G, 
which “generates” this sequence and is called a 
generating function of the sequence. 



Fig. 105. Team Project 14 


(c) Further applications of (13). Show that 
P n {\) = 1, P*(- 1) = (-l) n , P 2n+i(0) = 0, and 

P 2n(0) = (-I) n - 1 -3 • ■ * (2 m - I )/[2 • 4 • • • (2/!)]. 

(d) Bonnet’s recursion. 3 Differentiating (13) with 
respect to m, using (13) in the resulting formula, and 
comparing coefficients of m”, obtain the Bonnet 
recursion 

(14) (77 -I- l)P n+1 (A) = (2/1 + l)AP n (A) - 7lP n _ 1 ( X\ 

where n = I. 2, • • • . This formula is useful for 
computations, the loss of significant digits being small 
(except near zeros). Try (14) out for a few computations 
of your own choice. 


2 0L1NDE RODRIGUES (1794-1851), French mathematician and economist. 

3 OSSIAN BONNET (1819-1892), French mathematician, whose main work was in differential geometry. 
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15. (Associated Legendre functions) The associated 
Legendre functions P n k (x) play a role in quantum 
physics. They are defined by 

(15) P n k (x) = (I - x 2 ? 12 


and are solutions of the ODE 

(1 - x 2 )y" - 2xy' 

(16) f k 2 1 

+ L ,l( " + l) ~ y = °‘ 

Find Pi\x), P z (x), P 2 {x), and P 2 {x) and verify that 
they satisfy (16). 


5.4 Frobenius Method 

Several second-order ODEs of considerable practical importance — the famous Bessel 
equation among them — have coefficients that are not analytic (definition in Sec. 5.2), but 
are “not too bad,” so that these ODEs can still be solved by series (power series times a 
logarithm or times a fractional power of x, etc.). Indeed, the following theorem permits 
an extension of the power series method that is called the Frobenius method. The latter — 
as well as the power series method itself — has gained in significance due to the use of 
software in the actual calculations. 


THEOREM 


Frobenius Method 

Let b(x) and c(x) be any functions that are analytic at x = 0. Then the ODE 


( 1 ) 


" -l b( ^> ' -l c(jc) - n 

J + — y +— y -° 


has at least one solution that can be represented in the form 
00 

(2) y(x) = x r 2 = x r (a 0 + a x* + ^x 2 + ■ ■ •) (a 0 ¥= 0) 

7tt=0 

where the exponent r may be any ( real or complex) number (and r is chosen so that 
a 0 * 0). 

The ODE ( 1) also has a second solution ( such that these two solutions are linearly 
independent) that may be similar to (2) ( with a different r and different coefficients) 
or may contain a logarithmic term. (Details in Theorem 2 below.) 4 


For example, Bessel’s equation (to be discussed in the next section) 



(v a parameter) 


4 GEORG FROBENIUS (1849-1917). German mathematician, also known for his work on matrices and in 
group theory. 

In this theorem we may replace x by .v - a 0 with any number x 0 . The condition a 0 =£ 0 is no restriction; it 
simply means dial we factor out the highest possible power of .v. 

The singular point of (1) at x = 0 is sometimes called a regular singular point, a term confusing to the 
student, which we shall not use. 
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is of the form (1) with b(x) = 1 and c(jc) = x 2 — v 2 analytic at* = 0, so that the theorem 
applies. This ODE could not be handled in full generality by the power series method. 

Similarly, the so-called hypergeometric differential equation (see Problem Set 5.4) also 
requires the Frobenius method. 

The point is that in (2) we have a power series times a single power of * whose exponent 
r is not restricted to be a nonnegative integer. (The latter restriction would make the whole 
expression a power series, by definition; see Sec. 5.1.) 

The proof of the theorem requires advanced methods of complex analysis and can be 
found in Ref. [All] listed in App. 1. 

Regular and Singular Points 

The following commonly used terms are practical. A regular point of 

y" + p(x)y 4 q(x)y = 0 

is a point *0 at which the coefficients p and q are analytic. Then the power series method 
can be applied. If * 0 is not regular, it is called singular. Similarly, a regular point of the 
ODE 

h(x)y” 4 p{x)y\x) 4 q(x)y = 0 


is an x 0 at which h, p, q are analytic and H(x 0 ) ¥= 0 (so what we can divide by h and get 
the previous standard form). If x 0 is not regular, it is called singular. 

Indicial Equation, Indicating the Form of Solutions 

We shall now explain the Frobenius method for solving (1). Multiplication of (1) by x 2 
gives the more convenient form 

(1') x 2 y ft 4 xb(x)y r + c(x)y = 0. 

We first expand b(x) and c(x) in power series, 

b{x) = b 0 4- b ± x + b^x 2 4* • ■ • , c(x) = c 0 + c-pc + c 2 x 2 + - • • 

or we do nothing if b{ x) and c(x) are polynomials. Then we differentiate (2) term by term, 
finding 

00 

y ; (jc) = 2 ( m + r)a w A: m+r [m 0 + (r 4- 1 )apc 4* • • •] 

in — 0 
co 

(2*) v"M = S (« + r)(m + r - l)a m x m+r ~ 2 

m= 0 

= x r ~ 2 [r(r — l)a 0 + (r + l)ra x x +•••]. 

By inserting all these series into ( 1 ') we readily obtain 

x r [/ (r - l)a 0 +•••] + (fe 0 + t>ix + • • -)x r (ra 0 + • • •) 

( 3 ) 

+ ( c 0 + C X A - + • • -)* r («0 + a l x +•••) = 0 . 
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We now equate the sum of the coefficients of each power jc r , x r+ \ x r+2 , • • • to zero. This 
yields a system of equations involving the unknown coefficients a m . The equation 
corresponding to the power x r is 

[r(r - l) + b 0 r + c 0 )a 0 = 0. 

Since by assumption a 0 *£ 0, the expression in the brackets [* • *] must be zero. This gives 
(4) ;•(/* - l) + b Q r + c 0 = 0. 

This important quadratic equation is called the indicial equation of the ODE (1). Its role 
is as follows. 

The Frobenius method yields a basis of solutions. One of the two solutions will always 
be of the form (2), where r is a root of (4). The other solution will be of a form indicated 
by the indicial equation. There are three cases: 

Case 1. Distinct roots not differing by an integer 1, 2, 3, • • • . 

Case 2. A double root. 

Case 3. Roots differing by an integer 1, 2, 3, * • • . 

Cases 1 and 2 are not unexpected because of the Euler-Cauchy equation (Sec. 2.5), the 
simplest ODE of the form (1). Case 1 includes complex conjugate roots t\ and r 2 = T\ 
because r x — r 2 = r x — T\ = 2 i Im r x is imaginary, so it cannot be a real integer. The 
form of a basis will be given in Theorem 2 (which is proved in App. 4), without a general 
theory of convergence, but convergence of the occurring series can be tested in each 
individual case as usual. Note that in Case 2 we must have a logarithm, whereas in Case 
3 we may or may not. 


THEOREM 2 


Frobenius Method. Basis of Solutions. Three Cases 

Suppose that the ODE (1) satisfies the assumptions in Theorem 1. Let i\ and r 2 be 
the roots of the indicial equation (4). Then we have the following three cases. 

Case 1. Distinct Roots Not Differing by an Integer . A basis is 

(5) y x {x) = x\a 0 + a t x + a 2 x 2 + • • •) 
and 

(6) y 2 (x) = x r \A 0 + AiX + A z x 2 + • • •) 


with coefficients obtained successively from (3) with r = r x and r = r 2 , respectively. 
Case 2. Double Root r x = r 2 = r. A basis is 

(7) = x r (a 0 + a x x + a 2 x 2 + • • •) [r = |(1 - £> 0 )] 

{of the same general form as before) and 


( 8 ) 


y 2 W = J'iM In x + X r {A r x + A 2 x 2 + • • •) 


C* > 0). 
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EXAMPLE 1 


EXAMPLE 2 


Case 3 , Roots Differing by an Integer . A basis is 

(9) JVjlW = x\a 0 + a x x + a 2 x 2 + • • •) 

{of the same general form as before) and 

(10) y 2 {x) = kyi(x) In x 4- x\A 0 4- A x x 4- A 2 x 2 + •••)» 

where the roots are so denoted that r x — r 2 > 0 and k may turn out to be zero. 


Typical Applications 

Technically, the Frobenius method is similar to the power series method, once the roots 
of the indicial equation have been determined. However, (5)-(10) merely indicate the 
general form of a basis, and a second solution can often be obtained more rapidly by 
reduction of order (Sec. 2.1). 

Euler-Cauchy Equation, Illustrating Cases 1 and 2 and Case 3 without a Logarithm 

For the Euler-Cauchy equation (Sec. 2.5) 


x 2 y" + b 0 xy' + c 0 y = 0 


(b 0 , c 0 constant) 


substitution of y — x r gives the auxiliary equation 

r(r — I) -I- b 0 r 4- c 0 = 0, 

which is the indicial equation [and y = x r is a very special form of (2)!]. For different roots r ls r 2 we get a 
basis y 1 ~ x r \ y 2 = x 2 , and for a double root r we get a basis A r , x r In Jt. Accordingly, for this simple ODE, 
Case 3 plays no extra role. M 

Illustration of Case 2 (Double Root) 

Solve the ODE 

(11) x(x - 1)/ + (3a* - l)y' + y = 0. 

(This is a special hypergeometric equation, as we shall see in the problem set.) 

Solution . Writing (1 1) in the standard form (I ), we see that it satisfies the assumptions in Theorem 1. fWhat 
are b(x) and c(a) in (1 1)?J By inserting (2) and its derivatives (2*) into (1 1) we obtain 

CO CO 

2 On + r)(m + r - 1 )a m x m+r - 2 0» + r)(m + r - l)a TO jc m+r_1 
m=0 m»0 

(12) 

CO CO OC 

+ 3 2 0» + r)a m x m+r - 2 (m + r)a m x m+r ~ l + 2 = 0. 

m=0 m=0 m-0 

The smallest power is A r_1 , occurring in the second and the fourth series; by equating the sum of its coefficients 
to zero we have 

[-r(r - l) - r]a 0 = 0, thus r 2 = 0. 

Hence this indicial equation has the double root r = 0. 
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EXAMPLE 3 


First Solution . We insert this value r = 0 into (12) and equate the sum of the coefficients of the power 
a s to zero, obtaining 

<s - 1 )a s - (j + + 3 sa s - (.y + 1)^ +1 + a g = 0 

thus «s +1 = a $ . Hence a$ = = a 2 = ’ * ‘ • and by choosing r? 0 = 1 we obtain the solution 

jnw = 2 -v m = (M < i). 

m = 0 


Second Solution . We get a second independent solution y 2 by the method of reduction of order (Sec. 2. 1 ), 
substituting^ = mvi and its derivatives into the equation. This leads to (9), Sec. 2.1, which we shall use in this 
example, instead of starting reduction of order from scratch (as we shall do in the next example). In (9) of 
Sec. 2.1 we have p = (3.v - 1)/(a 2 - a), the coefficient of y in (I l) in standard form . By partial fractions, 

~ I P dx = "/ ^ T) dx = ' J + 7) dX = -2l"Cv- O - ln.v. 


Hence (9), Sec. 2.1. becomes 

f T , —2 — / p dx 

u = U = y x e J v 


(A " l) 2 

(A - l) 2 A 


A 


In a 

u = In a, y 2 = uyi = 2 


Vj andy 2 are shown in Fig. 106. These functions are linearly independent and thus form a basis on the interval 
0 < a < 1 (as well as on 1 < a < »). M 



Fig. 106. Solutions in Example 2 


Case 3, Second Solution with Logarithmic Term 

Solve die ODE 

(13) (A 2 - A )y" - av' + .V = 0. 

Solution. Substituting (2) and (2*) into (13), we have 

c© oc oo 

(a 2 - a) 2 ("' + >-){m + r - i)a m x m+r ~ 2 - x 2 (»> + r)a m x m+r ~ ] + 2 «„ l v m+r = 0. 

•m=0 m= 0 m= 0 

We now take a 2 , a, and a inside the summations and collect all terms with power A m * r and simplify algebraically, 

5C OC 

2 On + r - lfa m x m+r - 2 ('» + r)(m + r - l)a m x m+r ~ 1 = 0. 
m=* 0 m*» 0 

In the first scries we set m = s and in the second m = ,y + 1, thus s = m — 1. Then 

oc oc 

2 (* + r - l)V s+r - 2 (A + r + D(i + r)« s . ( I .v s+r = 0. 

s~0 s=-l 


(14) 
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The lowest power is .v r 1 (take s = - 1 in the second series) and gives the indicia! equation 

r{r - 1) = 0. 


The roots are r x = 1 and r 2 = 0. They differ by an integer. This is Case 3. 
First Solution. From (14) with r = r x = 1 we have 

3C 

2 [i' 2 « s - (* + 2 )(s + l)a s+1 ].v s+1 = 0. 

s«0 


This gives the recurrence relation 


a s+i ~ 


( s 4- 2 )(s + 1) 


(s = 0, I, ■ • •)• 


Hence ii\ — 0, a 2 = 0. • • • successively. Taking « 0 = 1, we get as a first solution y t = .v^uo = a. 

Second Solution . Applying reduction of order (Sec. 2.1), we substitute y 2 = yi« = a’w, y 2 = 4* u and 

y 2 = a*« w 4- 2// / into the ODE, obtaining 

(A- 2 — x)(.xu" 4- 2 u) - A’(A u 4* w) 4- xu = 0. 


a*m drops out. Division by a* and simplification give 

(.V 2 - x)u" + (.v - 2 )u = 0. 


From this, using partial fractions and integrating (taking the integration constant zero), we get 


* - 2 2 1 

4- , In u = In 

A A* — 1 


.Y 2 - X 


A - 1 


Taking exponents and integrating (again taking the integration constant zero), we obtain 


jf - i 


1 - -L 

7 " .v 2 


u = In .v 4- — 

X 


y 2 = xu = x In .v 4- I . 


Vi aud y 2 are linearly independent, and v 2 has a logarithmic term. Hence y x and y 2 constitute a basis of solutions 
for positive x . I 

The Frobenius method solves the hypergeometric equation, whose solutions include 
many known functions as special cases (see the problem set). In the next section we use 
the method for solving Bessel’s equation. 


PROBLEM SET 5.4 


|l-17| BASIS OF SOLUTIONS BY THE 
FROBENIUS METHOD 

Find a basis of solutions. Try to identify the series as 
expansions of known functions. (Show the details of your 
work.) 

1. .vv" + 2/ - xy = 0 2. (.v + 2)V - 2y = 0 

3. xy" + 5/ + xy = 0 

4. 2xy" + (3 — 4.v)y / + (2.v — 3)y = 0 

5. x 2 y" + 4xy' + (x 2 + 2).v = 0 

6. 4 at " + 2 y' + y = 0 

7. (x + 3 ) 2 y" - 9(x + 3)y' + 25y = 0 


8. xy" - y = Q 

9. xy" + (2a- + 1)/ + (a + l).v = 0 

10 . x 2 y" + 2a 3 / + (a 2 - 2)y = 0 

11. (a 2 + x)y" + (4a + 2)/ + 2v — 0 

12. a 2 y" + 6a/ + (4a 2 + 6)3' = 0 

13. 2a/' - (8a - 1)/ + (8a - 2)y = 0 

14. Ay" + / - xy = 0 

15. (a - 4) 2 y" - (a - 4)y' - 35y = 0 

16. a 2 v" + 4Ay' — (a 2 — 2)y = 0 

17. y" + (a - 6)y = 0 
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18. TEAM PROJECT. Hypergeometric Equation, 
Series, and Function. Gauss’s hypergeometric ODE 5 
is 


In (I 4- x) = aF(1, 1, 2; -a), 

l 4" A . o o 

In y— ^ =2xF{\, t,|;.v 2 ). 


(15) .t(1 - x)y" + [c - (a + b + 1)a]v' - aby = 0. 


Here, a , b , c are constants. This ODE is of the form 
p 2 y" + P\y' + Poy = o, where /; 2 , p j, p 0 are 
polynomials of degree 2, I, 0, respectively. These 
polynomials are written so that the series solution takes 
a most practical form, namely. 


ab a(a 4- 1 )b(b 4-1) 9 

- 1 + 777 - V + 2! c(c + I) v 


(16) 


a(a + l)(fl + 2)b{b + l)(i> + 2) , 

4* — .V 4- ' 

3! c(c 4- l)(c 4- 2) 


Find more such relations from the literature on special 
functions. 


(d) Second solution. Show that for r 2 — 1 — c the 
Frobenius method yields the following solution (where 
c ± 2, 3. 4, • • •): 


(17) 


y 2 (v) = -v 1 c | 


(a - c + I ){b -c + 1 ) 
l + * 


(a - c + l)(a - c + 2 )(b - c + 1 )U> - c + 2) 


2! (-c + 2)(— c + 3) 


■) 


Show that 


y 2 U) = ,x 1 ~ c F(a - c+ 1, b - c 4- 1,2 — c; ,v). 


This series is called the hypergeometric series. Its sum 
Vj. (. y) is called the hypergeometric function and is 
denoted by F(a, b , c: a). Here, c ^ 0,-1, -2, 

By choosing specific values of a , Z>, c we can obtain 
an incredibly large number of special functions as 
solutions of (15) [see the small sample of elementary 
functions in part (c)]. This accounts for the importance 
of (15). 

(a) Hypergeometric series and function. Show that 
the indicial equation of (15) has the roots ;*j = 0 and 
1 2 = I — c. Show that for r x — 0 the Frobenius method 
gives (16). Motivate the name for (16) by showing that 

I 

F(l, I. 1: a) = F(K b, b; x) = F(a , I, a; x) = . 

1 — A’ 

(b) Convergence. For what a or b will ( 1 6) reduce to 
a polynomial? Show that for any other cl b , c 
(c =£ 0, — I, -2, • * •) the series (16) converges when 

W < 1. 

(c) Special cases. Show that 

(1 4- a*) w = F(-;i, b> b; -a), 

(I - x) n = 1 - nxF(\ - /?, 1. 2: a), 
arctan.v = a F(|, 1, §; -a 2 ), 
arcsin x = a F(|, |: a 2 ). 


(e) On the generality of the hypergeometric 
equation. Show that 

(18) ( t 2 4- At 4- B)y 4- (C/ + D)y + Ky = 0 

with y = dy/dt. etc., constant A , B , C, D, K , and 
f 2 4- At 4- B = (t - t x )(t - t 2 \ t\ t 2 , can be reduced 
to the hypergeometric equation with independent 
variable 

t - h 
[ 2 ” ^1 


and parameters related by C/j + D = -e*(/ 2 — t x ), 
C = a 4- b 4- 1, K = ab . From this you see that (15) 
is a “normalized form” of the more general (18) and 
that various cases of (18) can thus be solved in terms 
of hypergeometric functions. 


19-24 1 HYPERGEOMETRIC EQUATIONS 

Find a general solution in terms of hypergeometric 
functions. 


19. x( I - x)y" + (| - 2x)/ - £)' = 0 

20. 2a(1 - x)y" - (1 + 6a-)/ - 2v = 0 

21. at( I - x)y" + %y' + 2y = 0 

22. 3/( 1 + l)y + ty - y = 0 

23. 2 U 2 - 5/ + 6)5’ + (2/ - 3)j - 8.v = 0 

24. 4(/ 2 - 3/ + 2)5' - 2v + v = 0 


“CARL FRIEDRICH GAUSS (1777-1855). great German mathematician. He already made the first of his great 
discoveries as a student at Helmstedt and Gottingen. In 1807 he became a professor and director of the Observatory 
at Gottingen. His work was of basic importance in algebra, number theory, differential equations, differential 
geometry. non-Euclidean geometry, complex analysis, numeric analysis, astronomy, geodesy, electromagnetism, 
and theoretical mechanics. He also paved the way for a general and systematic use of complex numbers. 
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5,f Bessel's Equation. Bessel Functions J v [x) 

One of the most important ODEs in applied mathematics in Bessel’s equation , 6 
(1) x 2 y " + xy f -h (x 2 - v 2 )y = 0. 

Its diverse applications range from electric fields to heat conduction and vibrations (see 
Sec. 1 2.9). It often appears when a problem shows cylindrical symmetry (just as Legendre’s 
equation may appear in cases of spherical symmetry). The parameter v in (1) is a given 
number. We assume that v is real and nonnegative. 

Bessel’s equation can be solved by the Frobenius method, as we mentioned at the 
beginning of the preceding section, where the equation is written in standard form 
(obtained by dividing (1) by x 2 ). Accordingly, we substitute the series 


(2) y(x) = 2 a m.x m+r (a 0 * 0) 

m= 0 

with undetermined coefficients and its derivatives into (1). This gives 

oo oo 

2 (m + r){m + r - 1 )a m x m+r + 2 0» + r)a m x m+r 

■m=0 m=0 

OO 00 

+ 2 a m x m+r+2 - V 2 2 a m x m+T = 0. 

7n=0 m - 0 

We equate the sum of the coefficients of x slr to zero. Note that this power x s * r 
corresponds to m = s in the first, second, and fourth series, and to m = s — 2 in the 
third series. Hence for s = 0 and s = 1, the third series does not contribute since 
m ^ 0. For s = 2, 3, • • • all four series contribute, so that we get a general formula for 
all these s. We find 

(a) r(r — l)tf 0 + ra 0 v 2 a 0 = 0 (s = 0) 

(3) (b) (r + 1 )rai + (r + l)a x — = 0 (s = 1) 

(c) (j + r)(s + r — 1 )a s + (s + r)a s 4- a s _ 2 - v 2 a s = 0 (s = 2, 3, • • •)• 

From (3a) we obtain the indicial equation by dropping a 0y 

(4) (r + !/)(/• - v) = 0. 

The roots are = v(^ 0) and r 2 = -v. 


6 FRIEDRICH WILHELM BESSEL (1784-1846), German astronomer and mathematician, studied astronomy 
on his own in his spare time as an apprentice of a trade company and finally became director of the new Konigsberg 
Observatory. 

Formulas on Bessel functions are contained in Ref. [GRI] and the standard treatise [A13J. 
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Coefficient Recursion for r = r x = v. For r = v , Eq. (3b) reduces to (2v + 1)^ = 0. 
Hence a ± = 0 since v ^ 0. Substituting r = v in (3c) and combining the three terms 
containing a s gives simply 

(5) (s 4- 2 v)sa $ 4- ct $ _ 2 = 0. 

Since = 0 and v ^ 0, it follows from (5) that a z = 0, a 5 = 0, • • • . Hence we have 
to deal only with even-numbered coefficients a $ with s = 2m. For s = 2m, Eq. (5) becomes 


(2m 4- 2v)2tna 2m + « 2 m -2 = 0. 


Solving for a 2m gives the recursion formula 


( 6 ) 


a 2m 


1 

2 2 m(v 4* m) 


fl 2m-2> 


m = 1, 2, • • • . 


From (6) we can now determine * * * successively. This gives 


6,2 2\v + 1) 

fl 2 _ «o 

fl4 “ 2 2 2 (v + 2) " 2^! («/ + 1)(*/ + 2) 

and so on, and in general 


(7) 


= (-D m fl 0 

° 2m 2 Zm m\ ( v + l)(v + 2) • • • (v + m) ’ 


/n = 1, 2, • • • . 


Bessel Functions J n (x) For Integer v = n 

Integer values of v are denoted by n. This is standard. For v = n the relation (7) becomes 


( 8 ) 


(-lrop 

a 2 m 2 2m m\ ( n 4- l)(n 4- '2) • • • (n 4- m) 


m = 1 , 2 , • • • . 


a 0 is still arbitrary, so that the series (2) with these coefficients would contain this arbitrary 
factor a 0 . This would be a highly impractical situation for developing formulas or 
computing values of this new function. Accordingly, we have to make a choice. a 0 = 1 
would be possible, but more practical turns out to be 

(9> a °=ih- 


because then nl(n + 1) • • • (n + m) — (m + n)\ in (8), so that (8) simply becomes 

= (~l) m 

° 2m 2 2m+n ml (n + m)\ ' 


( 10 ) 


m = 1, 2, • * * . 
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EXAMPLE 1 


This simplicity of the denominator of (10) partially motivates the choice (9). With these 
coefficients and r x = v = n we get from (2) a particular solution of (1), denoted by J n (x) 
and given by 


( 11 ) 


Ux) = Jt B 2 

Ttl — O 


(~l) m x 2m 
2 Zm+n m\ (n + »?)! ‘ 


J n (x) is called the Bessel function of the first kind of order n. The series (11) converges 
for all .v, as the ratio test shows. In fact, it converges very rapidly because of the factorials 
in the denominator. 


Bessel Functions J 0 (x) and J y [x) 

For n = 0 we obtain from (11) the Bessel function of order 0 


( 12 ) 


J 0 M = S 


(-l) m v 2m 


2 4 

.V „V 

= 1 o o + 


2 2m (m!) 2 2 2 ( I !) 2 2 4 (2!) 2 2 6 (3!) 2 


+ - 


which looks similar to a cosine (Fig. 107). For 7/ = 1 we obtain the Bessel function of order 1 


(13) J x (x) = 2 


j y*» v 2m-+-i 


-V 5 

+ 


771 = 0 


2 2m+1 m! (in + I)! 2 2 S 1!2! 2*2)3! 2 7 3!4! 


+ - 


which looks similar to a sine (Fig. 107). But the zeros of these functions are not completely regularly spaced 
(see also Table A1 in App. 5) and the height of the ‘'waves’* decreases with increasing .r. Heuristically, n 2 /x 2 
in (1) in standard form [(1) divided by .v 2 ] is zero (if n = 0) or small in absolute value for large .v, and so is 
yV.v. so that then Bessel’s equation comes close to y" 4 ■ y = 0. the equation of cos.v and sin a" also y'/.v acts 
as a “damping term,’* in part responsible for the decrease in height. One can show that for large a\ 


(14) 



7T 

7 


) 


where ~ is read “asymptotically equal” and means ihai for fixed n the quotient of the two sides approaches 1 
as a* — > 

Formula (14) is surprisingly accurate even for smaller .v (> 0). For instance, it will give you good starting 
values in a computer program for the basic task of computing zeros. For example, tor the first three zeros of J 0 
you obtain the values 2.356 (2.405 exact to 3 decimals, error 0.049), 5.498 (5.520, error 0.022), 8.639 (8.654, 
eiror 0.015), etc. I 



Fig. 107. Bessel functions of the first kind J 0 and Jt 
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Bessel Functions J„(x) for any v 0. Gamma Function 

We now extend our discussion from integer v = n to any v > 0. All we need is an 
extension of the factorials in (9) and (1 1) to any v. This is done by the gamma function 
F(» defined by the integral 


(15) 


r(v) = f e-'t'- 1 dt 

J o 


(v>0). 


By integration by parts we obtain 


r 

+ 1) = I e~Y dt = - 

■'n 


e t 


4 

J t\ 


e-'r* 1 dt. 


The first expression on the right is zero. The integral on the right is T(i'). This yields the 
basic functional relation 


(16) 


r(v + i) = i'r(i'). 


Now by (15) 



05 

= 0 - (- 1 ) = 1 . 
o 


From this and (16) we obtain successively F(2) = f(l) = 1 !, T(3) = 2T(2) = 2!, • • • 
and in general 


(17) 


T(/z + 1) = n! 


(n = 0, 1, • • •)• 


This shows the the gamma function does in fact generalize the factorial function. 

Now in (9) we had a 0 = l/(2"n!). This is l/(2 n F(n + 1)) by (17). It suggests to choose, 
for any v. 


( 18 ) 


1 

a ° ~ 2T(i/ + 1) ‘ 


Then (7) becomes 


(~ir 

(, 2 m 2 zm m] (y + 1)(v + 2) • • • (V + m)2’T(v + 1) ’ 


But (16) gives in the denominator 

(v + l)T(v + I) - T(v + 2), (; v + 2)T(v + 2) = T(v + 3) 

and so on, so that 

(v + \)(v 4- 2) ■ • • (v 4- m)T(p + 1) = T(v + 1). 
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Hence because of our (standard!) choice (18) of a 0 the coefficients (7) simply are 

(- 1 )”' 

(l9) ~ 2 2m+ "rn! T(v+ m+ l)' 

With these coefficients and r= i\ = v we get from (2) a particular solution of (1), denoted 
by J£x) and given by 


( 20 ) 


CO 

•/„(*) = *" 2 

m= 0 


(— i) w jg 2m 

2 2m+ "m\ T(v + m + 1) ' 


J,,(x ) is called the Bessel function of the first kind of order v. The series (20) converges 
for all a, as one can verify by the ratio test. 


General Solution for Noninteger v. Solution )_„ 

For a general solution, in addition to J v we need a second linearly independent solution. 
For v not an integer this is easy. Replacing v by — ^in (20), we have 


( 21 ) 


GO 

/-„(*) = A'”" X 

m= 0 


(-i)V m 

2 2m ~'m! r On - v + 1) ' 


Since Bessel’s equation involves v 2 , the functions J v and are solutions of the 
equation for the same v. If v is not an integer, they are linearly independent, because 
the first term in (20) and the first term in (21) are finite nonzero multiples of x v and 
x~ l \ respectively, x = 0 must be excluded in (21) because of the factor a*”" (with v > 0). 
This gives 


THEOREM 1 


General Solution of Bessel's Equation 

If v is not an integer, a general solution of Bessel* s equation for all x & 0 is 
(22) y(x) = CjJ^a-) + c 2 /_„(*). 


But if v is an integer, then (22) is not a general solution because of linear dependence: 


THEOREM 2 


Linear Dependence of Bessel Functions X, and J_ n 

For integer v — n the Bessel functions J n {x) and J- n (x) are linearly dependent t 
because 

(23) J- w ( x) = (-1 ) n J n (x) (n = 1, 2, • • •). 
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PROOF We use (21 ) and let v approach a positive integer n. Then the gamma functions in the 
coefficients of the first n terms become infinite (see Fig. 552 in App. A3.1), the 
coefficients become zero, and the summation starts with m = n. Since in this case 
T(w -«+!) = (m - /?)! by (17), we obtain 


90 

J-nM = 2 

m=n 


(-D 




2 2m ~ n m\ (m - n)\ 


= 2 

s—0 


j yt+s^2s+n 

2 2s+n (n + .?)! s! 


(m = n + s). 


The last series represents ( — l) n 7 n (A*), as you can see from (11) with m replaced by s . This 
completes the proof. ■ 


A general solution for integer n will be given in the next section, based on some further 
interesting ideas. 


Discovery of Properties From Series 

Bessel functions are a model case for showing how to discover properties and relations of 
functions from series by which they are defined. Bessel functions satisfy an incredibly large 
number of relationships — look at Ref. [A 13] in App. 1; also, find out what your CAS 
knows. In Theorem 3 we shall discuss four formulas that are backbones in applications. 


THEOREM 3 


Derivatives, Recursions 

The derivative of J r (x) with respect to x can he expressed by J t ,-\(x) or J v+1 (x) by 
the formulas 


(24) 


(a) [x v J ',(*)}' = 

(b) [j rv„(*)]' = 


Furthermore , J v (x) and its derivative satisfy the recurrence relations 


(24) 


(c) y ? _i(A‘) + J v+1 (x) = — J v (x) 

(d) /^ x ( x) - /„ +1 (a) = 2 jUx). 


PROOF (a) We multiply (20) by x r and take x 2p under the summation sign. Then we have 

~ (-l) m A 2m+2 " 

x‘JM) - 2 2 Zm+u rn\ rv +m + 1 ) ■ 

m=o 

We now differentiate this, cancel a factor 2, pull x 2 *'" 1 out, and use the functional 
relationship T(z/ + m + 1) = {v + m)T(u 4- m) [see (16)]. Then (20) with v — \ instead 
of v shows that we obtain the right side of (24a). Indeed, 


(x-’j v y = 2 

m—0 


(— l ) m 2(m + tix *****- 1 

2 2m+ "ml T(v -I- m + 1) 


= xV- 1 2 

m=0 


(~l) m x 2m 

2 2m+v ~ l m\ + m) ' 
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EXAMPLE 2 


(b) Similarly, we multiply (20) by x ", so that in (20) cancels. Then we differentiate, 
cancel 2m, and use m\ = m(m — 1)1. This gives, with + 1, 




(~l) m A- 


v 2m- 1 


= 2 




, 2 2m+ "-‘(/« - 1)! r(v + HI +1) 2 2s+,,+1 s! T(h + s + 2) 

1—1 5 — u 


Equation (20) with v + 1 instead of v and s instead of m shows that the expression on 
the right is -x~"J lr+l (x). This proves (24b). 

(c), (d) We perform the differentiation in (24a). Then we do the same in (24b) and 
multiply the result on both sides by x 2v . This gives 


(a*) vjT'J, + x-j'„ = at’ 7,,-1 
(b*) - vxT'j. + x v j'„ = -x"J„ +v 


Substracting (b*) from (a*) and dividing the result by x” gives (24c). Adding (a*) and 
(b*) and dividing the result by x v gives (24d). ■ 


Application of Theorem 3 in Evaluation and Integration 

Formula (24c) can be used recursively in the form 


Ah-iC*) = Y J ' Xx) ~ 


for calculating Bessel functions of higher order from those of lower order. For instance, J 2 (x) = 2 J\(x)/x — J 0 Cv), 
so that J 2 can be obtained from tables of J 0 and A (in App. 5 or. more accurately, in Ref. [GR1] in App. 1). 

To illustrate how Theorem 3 helps in integration, we use (24b) with v — 3 integrated on both sides. This 
evaluates, for instance, the integral 


/ = f x~*J 4 (x) dx = -x %(x) 


2 1 

- — ~ A(2) + Ad)* 


A table of J z (on p. 398 of Ref. [GR 1 ]) or your CAS will give you 

- £-0.128943 + 0.019563 - 0.003445. 


Your CAS (or a human computer in precomputer times) obtains A from (24), first using (24c) with u = 2, 
that is, A = 4.v” 1 / 2 “ A* lhen (24c) wilh v ~ h that A = 2 a *” 1 A - J 0 . Together, 


/ = .v" 3 (4.v" 1 (2v” l J 1 - J 0 ) - J t ) 


= -I 


[2y x (2) - 2y 0 (2) - y x (2)] + [8 Ad) - 4 Ad) - Ad)] 
= 4A(2) + iW + ?Ad) - 4 Ad). 


This is what you get. for instance, with Maple if you type int(- • •)- And if you type evalf(int(- • ■))» you obtain 
0.003445448, in agreement with the result near the beginning of the example. ® 


In the theory of special functions it often happens that for certain values of a parameter 
a higher function becomes elementary. We have seen this in the last problem set, and we 
now show this for J v . 
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THEOREM 4 


PROOF 


EXAMPLE 3 


Elementary J„ for Half-Integer Order v 

Bessel functions J v of orders ±5, ±§, ±|, • * • are elementary; they can be expressed 
by finitely many cosines and sines and powers of x. In particular , 


(25) 


(a) 



(b) 





When v = \ , then (20) is 

~ (-\y n x 2m IT ^ (-[) m x 2m+ 1 

Jm(X) ~ 2 2,n+1/2 ;n! TQn + |) " V * l M m\ T(m + §) ' 

To simplify the denominator, we first write it out as a product AB , where 

4 = 2 m m\ = 2m (2m - 2)(2m - 4) • • • 4 • 2 

and [use (16)] 

B = 2 m+1 V(m + §) = 2 TO+1 (m + |)(w - §) • • • § • |r(|) 

= (2m + 1)(2 m — 1) • • • 3 • 1 • Vir ; 

here we used 

(26) f(i) = Vir. 


We see that the product of the two right sides of A and B is simply (2m + OlVir, so that 
J lf 2 becomes 


J i/z( x ) “* 


nr ^ ( -i) w A- 2m+i 

V ( 2w + ‘)! 



as claimed. Differentiation and the use of (24a) with v = \ now gives 


W~xJ in (x)]' = 



x V2 J- m (x). 


This proves (25b). From (25) follow further formulas successively by (24c), used as in 
Example 2. This completes the proof. ■ 

Further Elementary Bessel Functions 

From (24c) with v = | and v — — J and (25) we obtain 


•W-v) = 7 Jm(x) - J.y s(.v) = JZ _ cos . v j 

1 / 2 / COS A* \ 

■/-3/2W = - 7 J-iisfx) - htdx) = - J— 1—7- + Sin.rl 


respectively, and so on. 
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We hope that our study has not only helped you to become acquainted with Bessel 
functions but has also convinced you that series can be quite useful in obtaining various 
properties of the corresponding functions. 


PROBLEM SET 5.5 


1. (Convergence) Show that the series in (11) converges 
for all a\ Why is the convergence very rapid? 

2. (Approximation) Show that for small |a| we have 
J 0 1 - 0.25a* 2 . From this compute 7 0 (x) for 

a* = 0, 0.1, 0.2, * • • , 1.0 and determine the error by 
using Table A1 in App. 5 or your CAS. 

3* (“Large” values) Using (14), compute 7 0 (a) for 
a* = 1.0, 2.0, 3.0, • • • , 8.0, determine the error by 
Table A1 or your CAS, and comment. 

4. (Zeros) Compute the first four positive zeros of 7 0 ( x) 
and 7] (a) from (14). Determine the error and comment. 

5-20 1 ODEs REDUCIBLE TO BESSEL’S 
EQUATION 

Using the indicated substitutions, find a general solution in 
terms of 7„ and 7_„ or indicate when this is not possible. 
(This is just a sample of various ODEs reducible to Bessel’s 
equation. Some more follow in the next problem set. Show 
the details of your work.) 

5. (ODE with two parameters) 

x 2 y" + a y' + (A 2 a 2 - v 2 )y = 0 (Aa = z) 

6. x z y n + x :y f + (a 2 - ^ )y = 0 

7. aV 4* xy r + \(x - v 2 )y = 0 (Va = z) 

8. (2a 4- 1)V + 2(2a + 1 )/ + 16a(a + J)y = 0 

(2a 4-1 = z) 

9. xv" — y f 4- 4a y = 0 (y = x u, 2x = z ) 

10. a 2 / + Ay' 4- |(a 2 - Y)y = 0 (a = 2 z) 

11. a y" 4- (2j/ 4- l)y' 4- xy = 0 (y = x~ ¥ u) 

12. A 2 y" 4- xy* 4- 4(a 4 - v 2 )y = 0 (a 2 = z) 

13. .v 2 y" 4- xy 1 4- 9(a 6 - v 2 )y = 0 (a 3 = z) 

14. y" 4- (e 2x - |)y = 0 (e x = z ) 

15. Ay" 4- y = 0 (y = Va w, 2Va = z) 

16. 16a 2 y" 4- 8 Ay' 4- (a 1/2 4- g)y = 0 
(y = a 1/4 «, a 1/4 = z) 

17. 36A 2 y" 4- 18Ay' 4* Va v = 0 
(y = a 1/4 «, |a 1/4 = z) * 

18. A 2 y" 4- Ay' 4- Vjry = 0 (4 a 1/4 = z) 

19. A 2 y" 4- \xy 4- VAy = 0 (y = a 2/ V 4a 1/4 = z) 

20. A 2 y" 4- (1 — 2 ^).vy ' 4- i/ 2 (a 2v 4* 1 — i^ 2 )y = 0 

(y = a 1 '*/, a" = z) 


2 1-28 1 APPLICATION OF (24): DERIVATIVES, 
INTEGRALS 

Use the powerful formulas (24) to do Probs. 21-28. (Show 
the details of your work.) 

21. (Derivatives) Show that Jq( a) = -7 X ( a), 

AM = VoW - A (a) /a, y^(A) = J[7 x (jc) - 7 3 (a)]. 

22. (Interlacing of zeros) Using (24) and Rolle’s theorem, 
show that between two consecutive zeros of 7 0 (a) there 
is precisely one zero of 7j(a). 

23. (Interlacing of zeros) Using (24) and Rohe’s theorem, 
show that between any two consecutive positive zeros 
of J n (x) there is precisely one zero of 7, 1+ i(a). 

24. (Bessel’s equation) Derive (I) from (24). 

25. (Basic integral formulas) Show that 

Ja'V^U) dx = a"7„(a) 4- c, 
jx~‘ 7„ +1 (a) dx = -x~“J v {x) + c, 

/•/.,+ i(-v) dx = /•/„_,(*) dx - 24(a). 

26. (Integration) Evaluate Jx~ 1 J 4 (x) dx. (Use Prob. 25; 
integrate by parts.) 

27. (Integration) Show that 

fx 2 J 0 (x) dx = **/,(*) + a7 0 (a) - fio(x) dx. (The 

last integral is nonelementary; tables exist, e.g. in Ref. 
TAJ 3) in App, 1.) 

28. (Integration) Evaluate J 7 5 (a) dx. 

29. (Elimination of first derivative) Show that y = uv 
with u(x) = exp (-| / p{ a) dx) gives from the ODE 
y" 4- p{x)y r 4- q(x)y = 0 the ODE 

+ [<?(*) - ipOif ~ \p'(x )\ « = 0 

no longer containing the first derivative. Show that for 
the Bessel equation the substitution is y = «a~ 1/2 and 
gives 


(27) 


a 2 m" 4- (a 2 4- 1 — v 2 )u — 0. 
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30. (Elementary Bessel functions) Derive (25) in 
Theorem 4 from (27). 

31. CAS EXPERIMENT. Change of Coefficient. Find 
and graph (on common axes) the solutions of 

y ,r + fa” V + y = 0, y(0) = 1. y ; (0) = 0, 

for k = 0, 1, 2, • • • , 10 (or as far as you get useful 
graphs). For what k do you get elementary functions? 
Why? Try for noninteger A\ particularly between 0 and 
2, to see the continuous change of the curve. Describe 
the change of the location of the zeros and of the 
extrema as k increases from 0. Can you interpret the 
ODE as a model in mechanics, thereby explaining your 
observations? 

32. TEAM PROJECT. Modeling a Vibrating Cable 
(Fig. 108). A flexible cable, chain, or rope of length L 
and density (mass per unit length) p is fixed at the upper 
end (a* = 0) and allowed to make small vibrations 
(small angles a in the horizontal displacement m(a, /), 
r = time) in a vertical plane. 

(a) Show the following. The weight of the cable below 
a point x is W(x) = pg(L - x). The restoring force is 
F(x) = W sin a ~ Wu x . u x = du/dx. The difference in 
force between a* and .v 4- Ax is Ax (Wu x ) x . Newton’s 
second law now gives 

p A.v u tl = A.v pg[(L - x)u x ] x . 

For the expected periodic motion 

u( jc, /) = v(a) cos (cor -t* 8) the model of the problem 

is the ODE 

(L - x)y” - y f + A 2 y = 0, A 2 = a ?!g. 

(b) Transform this ODE to y + s~ l y -t- y = 0, 
v = dyJds, s — 2Az 1/2 t z — L — a, so that the 
solution is 

y(.v) = y 0 (2wV(L - .v)/,?). 


(c) Conclude that possible frequencies coll'irzxz those 
for wliich s = IcoVUg is a zero of J Q . The 
corresponding solutions are called normal modes. 
Figure 1 08 shows the first of them. What does the second 
normal mode look like? The third? What is the frequency 
(cycles/min) of a cable of length 2 m? Of length 10 m? 



Equilibrium 

position 

Fig. 108. Vibrating cable in Team Project 32 

33. CAS EXPERIMENT. Bessel Functions for Large x. 

(a) Graph J n (x) for « = 0, • • • , 5 on common axes. 

(b) Experiment with ( 14) for integer /?. Using graphs, 
find out from which x = „v w on the curves of ( 1 1) and 
(14) practically coincide. How does x n change with n? 

(c) What happens in (b) if n = ±|? (Our usual 
notation in this case would be v.) 

(d) How does the error of (14) behave as function 
of x for fixed /t? [Error = exact value minus 
approximation (14).] 

(e) Show from the graphs that 7 0 (a ) has extrema where 
J x (x) = 0. Which formula proves this? Find further 
relations between zeros and extrema. 

(f) Raise and answer questions of your own, for 
instance, on the zeros of J 0 and J x . How accurately are 
they obtained from (14)? 


5.6 Bessel Functions of the Second Kind Y v (x) 

From the last section we know that J t , and form a basis of solutions of Bessel’s 
equation, provided v is not an integer. But when v is an integer, these two solutions are 
linearly dependent on any interval (see Theorem 2 in Sec. 5.5). Hence to have a general 
solution also when v = n is an integer, we need a second linearly independent solution 
besides J n , This solution is called a Bessel function of the second kind and is denoted 
by Y n . We shall now derive such a solution, beginning with the case n = 0. 

n = 0: Bessel Function of the Second Kind Y 0 (x) 

When n = 0, Bessel’s equation can be written 


( 1 ) 


xy" + y' + xv = 0. 
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Then the indicial equation (4) in Sec. 5.5 has a double root r = 0. This is Case 2 in 
Sec. 5.4. In this case we first have only one solution, J 0 (x). From (8) in Sec. 5.4 we see 
that the desired second solution must be of the form 


oc 

(2) y 2 (x) = Mx) In x + 2 A m x m . 

m= 1 

We substitute y 2 and its derivatives 


j =o 

v 2 = Jo In a* 4 — - +2 /wA m jc w_1 

^ m=l 

yl = ]nx + ^ - 4 + 2 *(« - DA m x— 2 

* * m-1 

into (1). Then the sum of the three logarithmic terms xJq In x , to and a\ 7 0 In a is zero 

because J 0 is a solution of (1). The terms — J 0 /x and J 0 f. x (from xy" and y') cancel. Hence 
we are left with 


cc oc oc 

2Jq + 2 m ( w ~ + 2 + 2 V“ +1 = 0. 

m=l m=l m=l 

Addition of the first and second series gives 'Zrn 2 A m x m ~ l . The power series of Jq(x) is 
obtained from (12) in Sec. 5.5 and the use of ml/m = (m — 1)! in the form 


, ~ (— l) m 2rnA* 2m “ 1 ~ (-irx 2 — 1 

J o(x) - 2 j 2 2m ( ,)2 - 2 ; 2 2 m_ 1 m! (m - 1)! ' 

W=1 V 7 7/1=1 V 7 

Together with S/n 2 A m x m_1 and 2/4 m .v m+1 this gives 


(3*) 


« ( _|)m v 2m— i 

, 2 2m ~ 2 tn\ (m - 1)! 

m=l v 7 


+ 2 ni 2 A m x m 1 + 2 A m x m+1 = 0. 

771=1 771=1 


First, we show that the A m with odd subscripts are all zero. The power x° occurs only in 
the second series, with coefficient A ± . Hence A x = 0. Next, we consider the even powers 
x 2s . The first series contains none. In the second series, m — 1 = 2s gives the term 
(2s 4- * n toe third series, m + 1 = 2s. Hence by equating the sum of the 

coefficients of x 23 to zero we have 


(Is 4- 1 ) 2 A 2s+l + A*.! = 0, 5=1,2,--. 

Since A ± = 0, we thus obtain A 3 = 0, A 5 = 0, • • • , successively. 

We now equate the sum of the coefficients of a* 254 * 1 to zero. For 5 = 0 this gives 

— 1 4- 4A 2 = 0, thus A 2 = 

For the other values of 5 we have in the first series in (3*) 2m — 1 = 25* -F 1, hence 
in = 5 4- 1, in the second m 1 = 25 4- 1, and in the third m 4- 1 = 25 4- 1 . We thus obtain 

(-If * 1 

2*(, + 1)! ,! + (25 + 2)2A — + = °‘ 
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For s = 1 this yields 


— + I6A4 + A 2 = 0, 

o 


thus 


A. = - 


128 


and in general 

( — I ) m— 1 /ll 1 \ 

(3) (> + J + J + '" + -) • 


m = 1, 2, 


Using the short notations 


(4) 


1 


1 


>h ~ 1 Ki - 1 + — + ••• H 


2 m 

and inserting (4) and A x = A 3 = • • * = 0 into (2), we obtain the result 


m — 2, 3, 


v 2 W = JqM In jt + 2 


1 * l n >n *m 


(5) 


“ 2 2m (m!f 


-J 0 M ,„ X+ ±S- 1 L X * + -J±; jl ‘- + 


Since J 0 and y 2 are linearly independent functions, they form a basis of (1) for a* > 0. 
Of course, another basis is obtained if we replace y 2 by an independent particular solution 
of the form a(y 2 + bJ 0 ), where a 0) and b aie constants. It is customary to choose 
a = *2, far and b — y — In 2, where the number y = 0.577 215 664 90 • • • is the so-called 
Euler constant, which is defined as the limit of 

1 1 

1 + — + ••• + In .v 

2 5 


as s approaches infinity. The standard particular solution thus obtained is called the Bessel 
function of the second kind of order zero (Fig. 109) or Neumann’s function of order 
zero and is denoted by Y 0 (x). Thus [see (4)] 


( 6 ) 


Y 0 (x) = 


2_ 
7 T 



+ 2 

77? = 1 


(~j r~x 

2 2w (m!) 2 



For small .v > 0 the function Y 0 (x) behaves about like ln.v (see Fig. 109, why?), and 
K 0 (jr) — * — » as x — * 0. 


Bessel Functions of the Second Kind V n (x) 

For v = n = 1 , 2, • • • a second solution can be obtained by manipulations similar to those 
for n = 0, starting from (10), Sec 5.4. It turns out that in these cases the solution also 
contains a logarithmic term. 

The situation is not yet completely satisfactory, because the second solution is defined 
differently, depending on whether the order v is an integer or not. To provide uniformity 
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of formalism, it is desirable to adopt a form of the second solution that is valid for all 
values of the order. For this reason we introduce a standard second solution Y v (x) defined 
for all v by the formula 


(7) 


(a) Y tt (x) = — r~ — [AX*) cos vi t - •/_„(*)] 

sin vi t 

(b) Y n (x) = lim Y v ( x). 


This function is called the Bessel function of the second kind of order v or Neumann’s 
function 7 of order v. Figure 109 shows Y 0 (x) and Y x (x). 

Let us show that J v and Y„ are indeed linearly independent for all v (and x > 0). 

For noninteger order k the function /,.(*) is evidently a solution of Bessel’s equation 
because AX*) and y_„(*) are solutions of that equation. Since for those v the solutions J v 
and are linearly independent and Y v involves y_„, the functions J v and Y v are linearly 
independent. Furthermore, it can be shown that the limit in (7b) exists and Y n is a solution 
of Bessel’s equation for integer order; see Ref. [A13] in App. 1. We shall see that the 
series development of F n (*) contains a logarithmic term. Hence y n (x) and T n (*) are linearly 
independent solutions of Bessel’s equation. The series development of Y n (x ) can be 
obtained if we insert the series (20) and (21), Sec. 5.5, for J„(x) and y_,X*) into (7a) and 
then let v approach n\ for details see Ref. [A 13]. The result is 


( 8 ) 


2 


Y n (x) = - J n (x) 

7 T 


/ x \ £ f (~i r~Vv + h m+n ) 

{ ln 2 7 17 ", 2 Zm+n m\ (m + «)! 

' 77T =U 


.v 2m 


LlV (n ~ m ~ 1)1 ^ 

it " 2 Zm ~ n m\ 
m - 0 


where * > 0, n = 0, 1, • • • , and [as in (4)] h 0 = 0, h x = 1, 
1 


Kn = 1 + ^ + 


1 1 1 
m 2 m -r n 



Fig. 109. Bessel functions of the second kind Y 0 and Y,. 
(For a small table, see App. 5.) 


7 CARL NEUMANN ( 1832-1925). German mathematician and physicist. His work on potential theory sparked 
the development in the field of integral equations by VITO VOLTERRA (1860-1940) of Rome. ERIC IVAR 
FREDHOLM (1866-1927) of Stockholm, and DAVID HILBERT (1862-1943) of Giittingen (see the footnote 
in Sec. 7.9). 

The solutions K„(.v) are sometimes denoted by V,,(.v); in Ref. [A 13] they are called Weber’s functions; Euler’s 
constant in (6) is often denoted by C or ln y. 


202 


CHAP. 5 Series Solutions of ODEs. Special Functions 


For n = 0 the last sum in (8) is to be replaced by 0 [giving agreement with (6)]. 
Furthermore, it can be shown that 

Y-nM = (-1 )%W. 

Our main result may now be formulated as follows. 


THEOREM 1 


General Solution of Bessel’s Equation 

A general solution of Bessel 's equation for all values of v {and x > 0) is 
(9) y(x) = C.JM + C 2 Y„(x). 


We finally mention that there is a practical need for solutions of Bessel’s equation that 
are complex for real values of x. For this purpose the solutions 


( 10 ) 


= J„(x) + iY,Xx ) 
H™(x) = J,Xx) - iY,Xx) 


are frequently used. These linearly independent functions are called Bessel functions of 
the third kind of order v or first and second Hankel functions 8 of order v. 

This finishes our discussion on Bessel functions, except for their “orthogonality,” which 
we explain in Sec. 5.7. Applications to vibrations follow in Sec. 12.9. 


TO - OB - m iF 5 r ET_35 £6 


1-10 


SOME FURTHER ODEs REDUCIBLE TO 
BESSEL’S EQUATIONS 


12. CAS EXPERIMENT. Bessel Functions for Large*. 
It can be shown that for large *, 


(See also Sec. 5.5.) 

Using the indicated substitutions, find a general solution in 
terms of J v and Y Indicate whether you could also use J_ v 
instead of Y v . (Show the details of your work.) 

1. x 2 y u + xy* + (jc 2 - 25 )y = 0 

2. x 2 y" + xy* + (9.v 2 - £)y = 0 (3* = z) 

3. 4.x y" + 4 y' + y = 0 (V* = z ) 

4. xy" + y f + 36y = 0 (12V* = z) 

5. x 2 y ,f + xy* + (4* 4 — 16)y = 0 (* 2 = z ) 

6. x*y" + xy' + (* 6 - \)y = 0 (§x 3 = z) 

7. xy" + I Iy r + xy = 0 (y = x~ 5 u) 

8. y" + 4a* 2 v = 0 (y = mV*, x 2 = z) 

9. a 2 )’" - 5.vy' + 9(x 6 - 8 )y = 0 (y = x 3 u, * 3 = z) 
10. xy" + ly' -I- 4*y = 0 (y = x~ z u, 2x = z) 


(11) y n (*) — V2/(7Tjc) sin (a* — — ^7r) 

with ~ defined as in (14) of Sec. 5.5. 

(a) Graph Y n (x) for n = 0, • * • , 5 on common axes. 
Are there relations between zeros of one function and 
extrema of another? For what functions? 

(b) Find out from graphs from which x — x n on 
the curves of (8) and (11) (both obtained from your 
CAS) practically coincide. How does x n change 
with n? 

(c) Calculate the first ten zeros * w , m = 10, 

of Y 0 (x) from your CAS and from (11). How does the 
error behave as m increases? 


11. (Hankel functions) Show that the Hankel functions ( 10) 
form a basis of solutions of Bessel' s equation for any v. 


(d) Do (c) for Yi(x) and Y 2 (x). How do the errors 
compare to those in (c)? 


8 HERMANN HANKEL (1839-1873), German mathematician. 
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13. Modified Bessel functions of the first kind of order v 

are defined by I„(x) = i = V^T. Show that 

I „ satisfies the ODE 

(12) a 2 / 4- Ay' - Cv 2 + v 2 )y = 0 
and has the representation 

oe j.2m+v 

(13) I v {x) = 2 2 2m+u m\ r(m + v + 1) ' 

7tl Ka 0 V 


14. (Modified Bessel functions If) Show that 7„(a) is real 
for all real x (and real v ), 7„(a) ^ 0 for all real x ^ 0, 
and I- n (x) = /„.(.v), where n is any integer. 

15. Modified Bessel functions of the third kind (sometimes 
called of the second kind) are defined by the formula (14) 
below. Show that they satisfy the ODE (12). 

(14) K„(x) = 7 [/_„(*) - /„«] 

Z Sin V7T 


5.i Sturm-Liouville Problems. 

Orthogonal Functions 

So far we have considered initial value problems. We recall from Sec. 2. 1 that such a problem 
consists of an ODE, say, of second order, and initial conditions y(x 0 ) = K 0i y'( x 0 ) = K x 
referring to the same point (initial point) x = x 0 . We now turn to boundary value problems. 
A boundary value problem consists of an ODE and given boundary conditions referring 
to the two boundary points (endpoints) x — a and x — b of a given interval a ^ x ^ b. 
To solve such a problem means to find a solution of the ODE on the interval a ^ x ^ b 
satisfying the boundary conditions. 

We shall see that Legendre’s, Bessel’s, and other ODEs of importance in engineering 
can be written as a Sturm-Liouville equation 

(1) [p(x)y'Y + [q{x) + A/-(x)]y = 0 


involving a parameter A. The boundary value problem consisting of an ODE (1) and given 

Sturm-Liouville boundary conditions 


( 2 ) 


(a) kiy(a) + k 2 y'(a) = 0 

(b) l x y{b) + l 2 y\b) = 0 


is called a Sturm-Liouville problem. 9 We shall see further that these problems lead to 
useful series developments in terms of particular solutions of (1), (2). Crucial in this 
connection is orthogonality to be discussed later in this section. 

In (1) we make the assumptions that p, q , r, and p are continuous on a S x § b, and 

r(x) >0 b). 

In (2) we assume that k x , k 2 are given constants, not both zero, and so are l x , l 2 , not both 
zero. 


9 JACQUE$ CHARLES FRANCOIS STURM (1803-1855), was born and studied in Switzerland and then 
moved to Paris, where he later became the successor of Poisson in the chair of mechanics at the Sorbonne (the 
University of Paris). 

JOSEPH LIOUVILLE (1 809-1 882), French mathematician and professor in Paris, contributed to various 
fields in mathematics and is particularly known by his important work in complex analysis (Liouville’s theorem: 
Sec. 14.4), special functions, differential geometry, and number theory. 
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EXAMPLE 1 


EXAMPLE 2 


Legendre’s and Bessel’s Equations are Sturm-Liouville Equations 

Legendre's equation (1 — A 2 )y” — Ivy’ + n(n 4 l)y = 0 may be written 

[(1 — a 2 )v’] # 4 Ay = 0 A = /«(/i + I). 

This is (l) with p = 1 - .v 2 . q = 0, and r = I. 

In Bessel's equation 

a 2 v 4- ay -I- (.v 2 — w 2 )y = 0 v = dy/dx, etc. 

as a model in physics or elsewhere, one often likes to have another parameter k in addition to n. For this reason 
we set x = kx. Then by the chain rule y = dy/dx = ( dy/dx ) dx/dx = y'/k. v = y"/k 2 . In the First two terms, k 2 
and k drop out and we get 

x y 4 xy 4 {k x - n )y = 0. 

Division by a* gives the Sturm-Liouville equation 

[a/]’ 4 ^ — 4 Ay j v = 0 

This is (1) with p = a, q = — /i 2 /a, and r = a. 

Eigenfunctions, Eigenvalues 

Clearly, y = 0 is a solution — the “trivial solution” — for any A because ( 1 ) is homogeneous 
and (2) has zeros on the right. This is of no interest. We want to find eigenfunctions y(x), 
that is, solutions of (1) satisfying (2) without being identically zero. We call a number A 
for which an eigenfunction exists an eigenvalue of the Sturm-Liouville problem (1), (2). 

Trigonometric Functions as Eigenfunctions. Vibrating String 

Find the eigenvalues and eigenfunctions of the Sturm-Liouville problem 
(3) y* 4 Ay = 0. v(0) = 0. y(7r) = 0. 

This problem arises, for instance, if an elastic string (a violin string, for example) is stretched a little and then 
fixed at its ends a = 0 and a = tt and allowed to vibrate. Then y(A) is the “space function" of the deflection 
m(.y, t) of the string, assumed in the form m(.y. t) = y(.v)ir(/), where t is lime. (This model will be discussed in 
great detail in Secs. 12.2-12.4.) 

Solution, From (1) and (2) we see that p = 1, q = 0, r = 1 in (1), and a = 0, b = tt, k x = = I, 

*2 = I 2 = 0 in (2). For negative A = — u 2 a general solution of the ODE in (3) isy(.v) = c\e vx 4 c 2 e“' /X . From 
the boundary conditions we obtain = c 2 = 0. so that y = 0, which is not an eigenfunction. For A = 0 die 
situation is similar. For positive A = v 2 a general solution is 

y(A) = A cos ux 4 B sin vx. 

From the first boundary condition we obtain y(0) =4 = 0, The second boundary condition then yields 

y(7r) = B sin vir — 0, thus v = 0. ± 1, ±2, 

For v = 0 we have y = 0. For A = v 2 = 1, 4, 9, 16, • • • , taking B = 1. we obtain 

v(a) = sin vx (v = 1, 2, • • •). 

Hence die eigenvalues of the problem are A = v 2 , where v - 1. 2. • * • . and corresponding eigenfunctions are 
y(.v) = sin where v — 1, 2, • • • . I 

Existence of Eigenvalues 

Eigenvalues of a Sturm-Liouville problem ( 1 ), (2), even infinitely many, exist under rather 
general conditions on p, q, r in (i). (Sufficient are the conditions in Theorem 1, below, 
together with p( x) > 0 and r(x) > 0 on a <x<b. Proofs are complicated; see Ref. [A3] 
or [A1 1] listed in App. 1.) 


= z.2 


A = k 
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Reality of Eigenvalues 

Furthermore, if p 9 q , r, and p in (1) are real-valued and continuous on the interval 
a^x^b and r is positive throughout that interval (or negative throughout that interval), 
then all the eigenvalues of the Sturm-Liouville problem (1), (2) are real. (Proof in 
App. 4.) This is what the engineer would expect since eigenvalues are often related to 
frequencies, energies, or other physical quantities that must be real. 

Orthogonality 

The most remarkable and important property of eigenfunctions of Sturm-Liouville problems 
is their orthogonality, which will be crucial in series developments in terms of eigenfunctions. 


DEFINITION 


Orthogonality 

Functions y x (x), y 2 (x), • * • defined on some interval a ^ x ^ b are called orthogonal 
on this interval with respect to the weight function r(x) > 0 if for all m and all n 
different from m, 


(4) 


/ '•OOymW.VnM dx = 0 (m * n). 


The norm ||v„, || of v,„ is defined by 


(5) 


II frill = 



>'(x)y m z (x) dx . 


Note that this is the square root of the integral in (4) with n — m. 

The functions y l9 y 2 , * • * are called orthonormal on a ^ x ^ b if they are 
orthogonal on this interval and all have norm 1. 

If /*(a*) = 1, we more briefly call the functions orthogonal instead of orthogonal 
with respect to /•( x) = 1; similarly for orthonormality. Then 


r b 

J .VmW v n (A*) dx = 0 (nr =£ n). 


\\y m \\ = 



EXAMPLE 3 Orthogonal Functions. Orthonormal Functions 

The functions y ?u (.v) = sin m.v, m — l, 2. • • • form an orthogonal set on the interval —tt^x= tt. because for 
m it n we obtain by integration [see (l 1) in App. A3.I] 

TT 7T 7T | r* 

I y m (.v)y n (A) dx = I sin nix sin nx dx = — I cos (m — n)x dx - — I cos (m + n)x dx = 0. 

J —TT J — TT ~ J —TT “ J —TT 

The norm ||.y m || equals Vtt, because 

||yj| 2 = J sin 2 nix dx - tt (m = 1, 2, • • •)• 

Hence the corresponding orthonormal set, obtained by division by the norm, is 


sin a* sin 2a 

\flT ’ VtT 


sin 3a 

\^TT 
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THEOREM 1 


PROOF 


Orthogonality of Eigenfunctions 


Orthogonality of Eigenfunctions 

Suppose that the functions p, q, r, and p in the Stimn-Lioiiville equation (1) are 
real-valued and continuous and r(A*) > 0 on the interval a ^ x = b. Let y m {x) and 
v n (.v) be eigenfunctions of the Stunn-Liouville problem ( 1 ), (2) that correspond to 
different eigenvalues k m and A n , respectively . Then y ?n , y n are orthogonal on that 
interval with respect to the weight function r, that is, 

(6) f r(x)y m (x)y n (x) dx = 0 (m ± n). 

J a 

If p(a) = 0, then (2a) can be dropped from the problem . If p(b) = 0, then (2b) 
can be dropped. [It is then required that y and y f remain bounded at such a point, 
and the problem is called singular, as opposed to a regular problem in which (2) 
is used.] 

If p(a) = p(b)> then (2) can be replaced by the “periodic boundary conditions" 

(7) y(a) = y(b\ /to) = /to). 


The boundary value problem consisting of the Sturm-Liouville equation (1) and the 
periodic boundary conditions (7) is called a periodic Sturm-Liouville problem. 

By assumption, y m and y n satisfy the Sturm-Liouville equations 

(pylnY + to + A m r)y m = 0 
(P.Vn)' + to + A n r)y n = 0 

respectively. We multiply the first equation by y n , the second by -y m , and add, 

(Am A n )? y m \ n }'m(Py n) } ? n( AV m) (P^-mX^n] 

where the last equality can be readily verified by performing the indicated differentiation 
of the last expression in brackets. This expression is continuous on a ^ x ^ b since p 
and p r are continuous by assumption and y m , y n are solutions of (1). Integrating over x 
from a to b , we thus obtain 

b 

n) I to < b). 

The expression on the right equals the sum of the subsequent Lines 1 and 2, 

P{b)[y' n (b)y m (b) - y' m (b)y n (bj\ (Line I) 

( 9 ) 

-p( a )[yh(a)y m (a) - yUa)y n W] (Line 2). 

Hence if (9) is zero, (8) with \ m — A n =£ 0 implies the orthogonality (6). Accordingly, 
we have to show that (9) is zero, using the boundary conditions (2) as needed. 


( 8 ) 


<A» 


r b r 

An) J '7«)V dx = piy'rJm - 7W’ 

a L 
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EXAMPLE 4 


EXAMPLE 5 


EXAMPLE 6 


Case 1. p(a ) = p(b) = 0. Clearly. (9) is zero, and (2) is not needed. 

Case 2. p(a) 4^ 0, p(b) = 0. Line 1 of (9) is zero. Consider Line 2. From (2a) we have 

kiy n (a) + k 2 yh(ci) = 0, 

kiyJLa) + k 2 yU<*) = 0. 


Let k 2 4 1 0. We multiply the first equation by y m (a ), the last by —y n (a) and add, 

- yU*)yJa)] = 0. 

This is k 2 times Line 2 of (9), which thus is zero since k 2 # 0. If k 2 = 0, then k^ 4 0 by 
assumption, and the argument of proof is similar. 

Case 3 . p(a) = 0 9 p(b) ^ 0. Line 2 of (9) is zero. From (2b) it follows that Line 1 of (9) 
is zero; this is similar to Case 2. 

Case 4. p(a) 4 1 0 9 p(b) £ 0, We use both (2a) and (2b) and proceed as in Cases 2 and 3. 
Case 5. p(a) = p{b). Then (9) becomes 

P{b)[y' n (b)y m (b) - y' m (b)y n (b) - y' n (a)y m (a) + .v, '„(«)}’„(«)]• 

The expression in brackets [• • •] is zero, either by (2) used as before, or more directly by 
(7). Hence in this case, (7) can be used instead of (2), as claimed. This completes the 
proof of Theorem 1. ■ 

Application of Theorem 1. Vibrating Elastic String 

The ODE in Example 2 is a Sturm-Liouville equation with p = 1, q = 0, and r = 1. From Theorem 1 it follows 
that the eigenfunctions v m = sin nix (m = 1 . 2, • • •) are orthogonal on the interval 0 ^ .v ^ tt. M 


Application of Theorem 1. Orthogonality of the Legendre Polynomials 

Legendre’s equation is a Sturm-Liouville equation (see Example I) 

[(I - ,v 2 )y']' + Ay = 0. A = II (n + 1) 


with p = I — .v 2 . </ = 0. and r = I. Since p(— 1) = p( 1 ) = 0, we need no boundary conditions, but have a 
singular Sturm — Liouville problem on the interval —1 = x = 1. We know that for n = 0, hence 

A = 0, 1 • 2, 2 • 3. • • ■ , the Legendre polynomials P n (x) are solutions of the problem. Hence these are the 
eigenfunctions. From Theorem I it follows that they are orthogonal on that interval, that is. 


(10) 



dx = 0 


(m i= n). ■ 


Application of Theorem 1. Orthogonality of the Bessel Functions J„(x) 

The Bessel function J n (x) with fixed integer n ^ 0 satisfies Bessel’s equation (Sec. 5.5) 

x 2 J n (x) + xj n (x) + (.v 2 - n 2 )J„(x) = 0. 

where j„ = dJ n /dx. j n = d 2 J n tdx 2 In Example 1 we transformed this equation, by setting x = Am, into a 
Sturm-Liouville equation 

[jry'(fcc)]' + (- -^ + * 2 x) j„(kx) = 0 

with p(.v) = ,v. q(x) = —n 2 /x. r(.v) = .v. and parameter A = k 2 . Since p( 0) = 0. Theorem 1 implies orthogonality 
on an interval 0 ^ .v ^ R (R given, fixed) of those solutions J n (kx) that are zero at x = R. that is. 


(ID 


M kR) = 0 


(/i fixed). 
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THEOREM 2 


EXAMPLE 7 


[Note that q(x) = -n 2 /x is discontinuous at 0, but this does not affect the proof of Theorem 1.] It can be shown 
(see Ref. [A 13]) that J n (x) has infinitely many zeros, say, x = l < a n 2 < • • * (see Fig. 107 in Sec. 5.5 for 
n — 0 and I ). Hence we must have 

(12) kR = a n<m thus k nm = a nm /R (m = l T 2, • • •). 

This proves the following orthogonality property. 


Orthogonality of Bessel Functions 

For each fixed nonnegative integer n the sequence of Bessel functions of the first 
kind J n (k n ix), J n (k Ut 2 x), • • • with k n m as in (12 ) forms an orthogonal set on the 
interval 0 = x ^ R with respect to the weight function r(x) = x, that is, 

r R 

(13) xJ n (k nni x)J n (k n iX) dx = 0 (j m, n fixed). 

J o 


Hence we have obtained infinitely many orthogonal sets , each corresponding to one of the fixed values n. This 
also illustrates the importance of the zeros of the Bessel functions. H 

Eigenvalues from Graphs 

Solve the Sturm-Liouville problem y" + Ay = 0, y(0) + y'(0) = 0. y(7r) - y'(7r) = 0. 

Solution . A general solution and its derivative are 

y = A cos kx + B sin kx and y f = —Ak sin kx -f Bk cos kx\ k = VA. 

Tlie first boundary condition gives y(0) + v'(0) = A + Bk = 0, hence A = -Bk. The second boundary condition 
and substitution of A = -Bk give 

y(77) — y'iir) = A cos irk + B sin irk 4- Ak sin irk — Bk cos irk 

= —Bk cos 7 rk + B sin irk - Bk 2 sin irk — Bk cos vk = 0. 

We must have B ^ 0 since otherwise B = A — 0, hence y = 0. which is not an eigenfunction. Division by 
B cos irk gives 

2 -2k 

-k + tan 7 Tk - k tan vk - k = 0, thus tan irk = —z . 

k 2 - I 

The graph in Fig. 1 10 now shows us where to look for eigenvalues. These correspond to the ^-values of the points 
of intersection of tan irk and the right side —2k/(k 2 — l) of the last equation. The eigenvalues are A m = k m 2 . 
where Aq = 0 with eigenfunction yo = 1 and the other A m are located near 2 2 , 3 2 , 4 2 . • • • , with eigenfunctions 
cos k m x and sin k m x. m — 1, 2, • • • . The precise numeric determination of the eigenvalues would require a 
root-finding method (such as those given in Sec. 1 9.2). ■ 



Fig. 110. Example 7. Circles mark the intersections of tan irk and -2k/[k 2 - 1) 
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1. (Proof of Theorem 1) Carry out the details in Cases 
3 and 4. 

2. Normalization of eigenfunctions y m of ( 1 ), (2) means 
that we multiply y w by a nonzero constant c m such that 
c m y m has norm 1 . Show that z m = cy m with any c ^ 0 
is an eigenfunction for the eigenvalue corresponding to 
AW 

3. (Change of jc) Show that if the functions v 0 (a), yi(.v), 

• • • form an orthogonal set on an interval a ^ a ^ b 
(with r(A) = 1). then the functions y 0 (ct 4- k\ y x (ct 4- k\ 

• • • , c > 0, form an orthogonal set on the interval 
(a - k)lc £/£(*- k)lc. 

4. (Change of x) Using Prob. 3, derive the orthogonality 
of 1, cos 7 nr, sin tta, cos 27ta\ sin 27 ta\ on 
— 1 = a = 1 (r(jc) = 1) from that of 1, cos a, sin a\ 
cos 2 a, sin 2 a, ♦ • • on — 7r ^ a 7r. 

5. (Legendre polynomials) Show that the functions 

P n (cos 0), ;i = 0, form an orthogonal set on 

the interval 0 ^ 0 ^ tt with respect to the weight 
function sin 0. 

6. (Tranformation to Sturm-Liouville form) Show that 
y" -f fy 4- (g 4- A h)y = 0 takes the form (1) if you 
set p = exp (// dx). q = pg. r = hp. Why would you 
do such a transformation? 


7-19 1 STURM-LIOUVILLE PROBLEMS 

Write the given ODE in the form (1) if it is in a different 
form. (Use Prob. 6.) Find the eigenvalues and eigenfunctions. 
Verify orthogonality. (Show the details of your work.) 


y( 0) = 0, y(5) = 0 
y'(0) = 0, v'(7T) = 0 

y( 0) = 0. v'(L) = 0 

y( 0) = y(l), v # ( 0) = y'( I) 

v(0) = .v(27r), v(0) = y'(2ir) 

y( 0) + v'(0) = 0, 


7. y" 4- Ay = 0, 

8. y" 4- Ay = 0, 

9. y" 4- Ay = 0, 

10. y" 4- Ay = 0, 

11. y" + Ay = 0. 

12. y" + Ay = 0, 

y(\) + /(D = 0 

13. / + Ay = 0, y(0) = 0. y(l) + y'(l) = 0 

14. (xy')' + AAT -1 y = 0, v(l) = 0, y\e) = 0. 
(Set jc = e l .) 


15. (j rV)' + (A + 1 )a*“ 3 v = 0, y(l ) = 0. 
y (e") = o. (Set a* = e l .) 

16. v" - 2y' + (A + 1 )y = 0. y(0) = 0. 

y(l) = 0 

17. y" + 8y' + (A + 16)y = 0. y(0) = 0, 

y(7r) = 0 


18. xy" + 2y' 4* Ajcy = 0, y(7r) = 0. y(27r) = 0. 

(Use a CAS or set y = jc _1 «.) 


19. y" - 2A*“ 1 y' + ( k 2 + 2jc“ 2 )y = 0 t y(l) = 0 t y(2) = 0. 
(Use a CAS or sety = am.) 

20. TEAM PROJECT. Special Functions. Orthogonal 
polynomials play a great role in applications. For this 
reason. Legendre polynomials and various other 
orthogonal polynomials have been studied extensively; 
see Refs. [GR1], [GR10] in App. I. Consider some of 
the most important ones as follows. 

(a) Chebyshev polynomials 10 of the first and second 
kind are defined by 

7 n (.v) = cos (/i arccos x) 

sin [(/i + 1 ) arccos a] 

Un(x) ~ VT^7 


respectively, where n = 0, 1, • • *. Show that 


7- 0 = 

1. 

Ux) ■■ 

= X , 

T 2 (x) — 

2a- 2 - 

1, 



T 3 (x) 

= 4.v 3 

- 3a, 



Uo = 

1, 


=2x. 

U 2 (x) = 

4a 2 - 

1. 



U 3 (x) 

= 8a 3 

- 4a. 



Show 

that 

the Chebyshev 

polynomials 

T n (x) 

are 


orthogonal on the interval — 1 ^ x ^ 1 wit h respect to 
the weight function r(A) = 1/Vl — a 2 . (Hint. To 
evaluate the integral, set arccos a = 0.) Verify that 
r„(.v). n = 0, 1. 2, 3. satisfy the Chebyshev equation 

(1 - a 2 )v" - jcy' 4- n 2 y = 0. 


(b) Orthogonality on an infinite interval: Laguerre 
polynomials 11 are defined by L 0 = 1, and 


L n (x) = 


n\ 


d n (x n e~ x ) 

dx 11 


n = 1,2, 


Show that 

L t (x) = 1 - a \ Lz(x) = 1 - 2x + x 2 /2. 


L $( x ) = 1 — 3.v + 3 .v 2 /2 — a -3 /6. 


Prove that the Laguerre polynomials are orthogonal on 
the positive axis 0 ^ a < ^ with respect to the weight 
function r(A) = e~ x . Hint. Since the highest power in 
L m is a"*, it suffices to show that J e~ x x k L n dx = 0 for 
k < n. Do this by k integrations by parts. 


10 PAFNUTI CHEBYSHEV (1821-1894). Russian mathematician, is known for his work in approximation 
theory and the theory' of numbers. Another transliteration of the name is TCHEBICHEF. 

11 EDMOND LAGUERRE (1834-1886). French mathematician, who did research work in geometry and in 
the theory of infinite series. 
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CHAP. 5 Series Solutions of ODEs. Special Functions 


5.8 Orthogonal Eigenfunction Expansions 

Orthogonal functions (obtained from Sturm-Liouville problems or otherwise) yield 
important series developments of given functions, as we shall see. This includes the famous 
Fourier series (to which we devote Chaps. 1 1 and 12), the daily bread of the physicist and 
engineer for solving problems in heat conduction, mechanical and electrical vibrations, etc. 
Indeed, orthogonality is one of the most useful ideas ever introduced in applied mathematics. 

Standard Notation for Orthogonality and Orthonormality 

The integral (4) in Sec. 5.7 defining orthogonality is denoted by (y m , y„). This is standard. 
Also, Kronecker’s delta 12 8 mn is defined by S„ m = 0 if m =£ n and 8 mn = 1 if m = n 
(thus 8 nn = 1). Hence for orthonormal functions y 0 , yi> .V 2 > * * * with respect to weight 
r(x) (> 0) on ct = x = b we can now simply write (y w , y n ) = 8„ m , written out 

fO if m =£ n 

( 1 ) .Vn) I ^ M V771W V?^-^) tlx &nm j 

[l if m = n. 


Also, for the norm we can now write 


( 2 ) 


II v|| = VOwJ = 



>'(x)y m 2 (x) dx . 


Write down a few examples of your own, to get used to this practical short notation. 


Orthogonal Series 

Now comes the instant that shows why orthogonality is a fundamental concept. Let 
y 0 , yi, y 2 . * • * be an orthogonal set with respect to weight r(x) on an interval a ^ x ^ b. 
Let f(x) be a function that can be represented by a convergent series 


cc 

(3) /( X) = 2 «m.VmW = «0,'’oW + «l)iW + • ' • ■ 

771 = 0 


This is called an orthogonal expansion or generalized Fourier series. If the y m are 
eigenfunctions of a Sturm-Liouville problem, we call (3) an eigenfunction expansion. In 
(3) we use again m for summation since n will be used as a fixed order of Bessel functions. 

Given /( x), we have to determine the coefficients in (3), called the Fourier constants 
of f(x) with respect to y 0 , yi, • • • . Because of the orthogonality this is simple. All we have 
to do is to multiply both sides of (3) by r(x)y n (x) ( n fixed) and then integrate on both sides 
from a to b . We assume that term-by-term integration is permissible. (This is justified, for 
instance, in the case of “uniform convergence,” as is shown in Sec. 15.5.) Then we obtain 



y n ). 

m-0 


12 LEOPOLD KRONECKER (1823-1891). German matliematician at Berlin University, who made important 
contributions to algebra, group theory, and number theory. 
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EXAMPLE 1 


Because of the orthogonality all the integrals on the right are zero, except when m = n. 
Hence the whole infinite series reduces to the single term 


«n(.Vn,3'n) = ^nll^nll 2 - 


Assuming that all the functions y n have nonzero norm, we can divide by ||y n || 2 ; writing 
again m for n, to be in agreement with (3), we get the desired formula for the Fourier 
constants 


(4) 




(/. y m ) 
\\y m II 2 


IlyJf 


/ >ix)f(x)y m (x) dx 


(m = 0 , 1 , • • •)• 


Fourier Series 

A most important class of eigenfunction expansions is obtained from the periodic Sturm-Liouville problem 

y" + A y = 0, v(7r) = 3>(-7r), v'(7r) = /(- tt). 

A general solution of the ODE is y — A cos kx + B sin kx, where k = VX. Substituting y and its derivative 
into the boundary conditions, vve obtain 

A cos kTr + B sin kir = A cos (-for) + B sin (-for) 

—kA sin kir + kB cos kir = —kA sin (-b tt) + kB cos (— for). 

Since cos ( — a) = cos or, the cosine terms cancel, so that these equations give no condition for these terms. Since 
sin (-a) = -sin a, the equations gives the condition sin kir — 0, hence kir = mrr, k = m = 0, 1, 2, ■ ■ * t so 
that the eigenfunctions are 

cos0=l, cos a, sin a*, cos 2*, sin 2v, ■ • • , cos mx, sin nix, ■ • • 

corresponding pairwise to the eigenvalues A = k 2 = 0, 1 , 4, ■ • • , m 2 , • • * . (sin 0 = 0 is not an eigenfunction.) 
By Theorem 1 in Sec, 5.7, any two of these belonging to different eigenvalues are orthogonal on the interval 
7r ~ x ^ 7r (note that r(x) = 1 for the present ODE). The orthogonality of cos mx and sin mx for the same 
m follows by integration, 

.7 T J - TT 

J cos mx sin mx dx = — j sin 2mx dx — 0. 

For the norms we get || 1 1| = Vfor, and VX for all the others, as you may verify by integrating 1, cos 2 .v, 
sin 2 .v, etc., from This gives the series (with a slight extension of notation since we have two functions 

for each eigenvalue 1, 4, 9 t • * •) 


QO 

(5) f{x) = a 0 + 2 i a m cos mx -f b m sin mx). 

m=l 


According to (4) the coefficients (with m = 1, 2, • • •) are 


( 6 ) 



J /(,)*. 

J —TT 





fix) cos mx dx , 



J fix) sin mx dx. 


The series (5) is called the Fourier series of fix). Its coefficients are called the Fourier coefficients of fix), 
as given by the so-called Euler formulas (6) (not to be confused with the Euler formula (11) in Sec. 2.2). 

For instance, for the “periodic rectangular wave” in Fig. Ill, given by 

f — 1 if — tt < x < 0 

fix) = 

l 1 if 0 < A ’<77 


and /(.v + 2if) = fix). 
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EXAMPLE 2 


we get from (6) the values a 0 = 0 and 


I 

a m = — 
7 T 

1 


J" ( - 1 ) cos mx dx + J 1 • cos mx dxj = 0, 

£ J (— 1 ) sin mx dx 4- J 1 • sin mx dx j 
o 


7T 

1 | cos mx 


■H 


cos mx 


m 


1 


= [1 — 2 cos tmr + 1] = 

TT/ll 


f4/(7rm) ifm=l,3, • 

1 0 lfm-2,4* 


Hence the Fourier series of the periodic rectangular wave is 
fix) 


4/1 I \ 

r (.r) = — I sin.t 4* — sin 3* + -j sin 5,v 4 • • • I . 


fix) 

1 




1 t 

t 

l 

1 

1 

1 

1 

l 

i 


-7T ( 

I 

1 

) n 

I 

2n 

1 

X 


Fig. 111. Periodic rectangular wave in Example 1 


Fourier series are by far the most important eigenfunction expansions, so important to 
the engineer that we shall devote two chapters (1 1 and 12) to them and their applications, 
and discuss numerous examples. 

Did it surprise you that a series of continuous functions (sine functions) can represent 
a discontinuous function? More on this in Chap. 11. 


Fourier-Legendre Series 

A Fourier-Legendre series is an eigenfunction expansion 

oc 

fix) - 2 a mPmix) = ctoPo + CO + ’ * ' = ci 0 + a x x + a 2 (§x 2 - |) + • • 

0 

in terms of Legendre polynomials (Sec. 5.3). The latter are the eigenfunctions of the Sturm-Liouville problem 
in Example 5 of Sec. 5.7 on the interval - 1 ^ x ^ 1 . We have r(.v) = 1 for Legendre’s equation, and (4) gives 


(7) 




2m + 1 
2 


[ fix)P m (x)dx . 


m = 0, 1, • • • 


because the norm is 


( 8 ) 



(m = 0, J, • • ♦) 


as we state without proof. (The proof is tricky; it uses Rodrigues’s formula in Problem Set 5.3 and a reduction 
of the resulting integral to a quotient of gamma functions.) 
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EXAMPLE 3 


For instance, let /(x) = sin ttx. Then we obtain the coefficients 

J . .1 


2mi + 1 f 3 f 3 

a m = — - — I (sin 7 ta‘) P m (x) dx, thus cii = — I x sin ttx dx = — = 0.95493, etc. 

2 J __ J 2 7T 


Hence the Fourier-Legendre series of sin ttx is 


sin ttx = 0.95493/ > 1 (x) - I.15824/> 3 (x) + 0.21429P 5 (x) - 0.01664P 7 (x) + 0.00068P 9 (x) - 0.00002P n (x) + • 

The coefficient of P 13 is about 3 • 10 -7 . Hie sum of the first three nonzero terms gives a curve that practically 
coincides with the sine curve. Can you see why the even-numbered coefficients are zero? Why a 3 is the absolutely 
biggest coefficient? M 

Fourier-Bessel Series 

In Example 6 of Sec. 5.7 we obtained infinitely many orthogonal sets of Bessel functions, one for each of J Qi 
J\* 7 2 , * * ‘ . Each set is orthogonal on an interval 0 .v si R with a fixed positive R of our choice and with 
respect to the weight x. The orthogonal set for J n is 7 n (£ ?ll x). J n (k n , 2 *), */ n (/: n>3 .v) T • • • , where n is fixed and 
kn,m ls given in (12), Sec. 5.7. The corresponding Fourier-Bessel series is 

30 

(9) f(x) = 2 a vi J n( k n,m x ) = «M k n,l x ) + «2-*i t(*n,2-*) + a 3 J n( k n# x ) + ’ * ' (« fixed). 

7H=1 

The coefficients are (with m 


2 f* 

(10) ~ -2 r 2 , .1 x/(x) J n (k n ,mX) dx, /// — 1, 2, 

/r - / n+l( Qf n,?n) * / 0 

because the square of the norm is 

r ^ 

(1 1) IUn(W) II 2 = J o *Jn\k n . , m x) dx = — 4 + l(*n.m*) 

as we state without proof (which is tricky; see the discussion beginning on p. 576 of [A13]). 

For instance, let us consider f(x) = 1 — x 2 and take R - 1 and n = 0 in the series (9), simply writing A for 
chq tin . Then k nrn = of 0m = A = 2.405, 5.520, 8.654, 1 1.792, etc. (use a CAS or Table A1 in App. 5). Next we 
calculate the coefficients a m by (10), 

0,n = , a... [ X(1 - x 2 )J 0 (Xx) dx. 

J 1 (A) 

This can be integrated by a CAS or by formulas as follows. First use [xJ 1 (Ax))' = A.vJ f 0 (Ax) from Theorem 3 
in Sec. 5.5 and then integration by parts, 

«m = TTT- f a-(1 - x 2 )J 0 (\x) dx = (1 - *VA(A*) I’ - \ f xJ x ( Xx)(-2x) dx rl . 

J 1 (A) ^0 Jl (A) L A |o A J 0 J 

The integral-free part is zero. The remaining integral can be evaluated by [x 2 / 2 (Ax)]' = Ax 2 7 1 (A.v) from Theorem 
3 in Sec. 5.5. This gives 

- 4/ * (A) n - i 

” a 2 a 2 (A) (A " 

Numeric values can be obtained from a CAS (or from the table on p. 409 of Ref. [GR1] in App. 1, together 
with the formula J 2 = 2x~ 1 J l - J 0 in Theorem 3 of Sec. 5.5). This gives the eigenfunction expansion of 
1 — x 2 in terms of Bessel functions J Q} that is, 

1 - x 2 = 1.1081/ o (2.405x) - 0.1398J o (5.520x) + 0.0455/ o (8.654x) - 0.02 \0J o (\ 1.792*) + * • • . 

A graph would show that the curve of 1 — x 2 and that of the sum of the first three terms practically coincide. I 
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Mean Square Convergence. 

Completeness of Orthonormal Sets 

The remaining part of this section will give an introduction to a convergence suitable in 
connection with orthogonal series and quite different from the convergence used in 
calculus for Taylor series. 

In practice, one uses only orthonormal sets that consist of ‘‘sufficiently many” functions, 
so that one can represent large classes of functions by a generalized Fourier series (3) — 
certainly all continuous functions on an interval a ^ A' ^ b, but also functions that do “not 
have too many” discontinuities (see Example 1). Such orthonormal sets are called “complete” 
(in the set of functions considered; definition below). For instance, the orthonormal sets 
corresponding to Examples 1-3 are complete in the set of functions continuous on the 
intervals considered (or even in more general sets of functions; see Ref. [GR7], Secs. 3.4-3.7, 
listed in App. 1, where “complete sets” bear the more modern name “total sets”). 

In this connection, convergence is convergence in the norm, also called mean-square 
convergence; that is, a sequence of functions f k is called convergent with the limit f if 

(12*) Jim || f k - /|| = 0; 

k — *-cc 

written out by (2) (where we can drop the square root, as this does not affect the limit) 

r b 

(12) Jim J r(x)[f k (x) - /( a )] 2 dx = 0. 

Accordingly, the series (3) converges and represents / if 

r b 

(13) Jim r(x)[s k (x) - f(x)] 2 dx = 0 
where s k is the &th partial sum of (3). 

k 

(14) -v fc ( x) = 2 a m y m (x). 

?n-0 

By definition, an orthonormal set y 0 , y l9 • • • on an interval a ^ a* ^ b is complete in 
a set of functions S defined on ci ^ x ^ b if we can approximate every / belonging to S 
arbitrarily closely by a linear combination a 0 y 0 + ci 1 y 1 + • • • + a k y k> that is, technically, 
if for every e > 0 we can find constants a 0 , • • • f a k (with k large enough) such that 

(15) 11/ - (flo.Vo + • • • + a k y k ) || < e. 


An interesting and basic consequence of the integral in (13) is obtained as follows. 
Performing the square and using (14), we first have 



The first integral on the right equals 2 a m 2 because / ry ni y t dx = 0 for m * I, and 
/ O'm 2 dx = 1. In the second sum on the right, the integral equals a m , by (4) with 
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THEOREM 1 


PROOF 


EXAMPLE 4 


|| y m || 2 = 1. Hence the first term on the right cancels half of the second term, so that the 
right side reduces to 

- £ « w 2 + 
m= 0 

This is nonnegative because in the previous formula the integrand on the left is nonnegative 
(recall that the weight r(x) is positive!) and so is the integral on the left. This proves the 
important Bessel’s inequality 

(16) 2 ||/|| 2 = 

m- 0 

Here we can let k oo, because the left sides form a monotone increasing sequence that 
is bounded by the right side, so that we have convergence by the familiar Theorem 1 in 
App. A3.3. Hence 

( 17 ) 2 aj S |/* ! . 

m-0 

Furthermore, if y 0 , y lf • • • is complete in a set of functions S, then (13) holds for every 
/ belonging to S. By (15) this implies equality in (16) with k-+ <*. Hence in the case of 
completeness every / in S satisfies the so-called ParsevaPs equality 

«8) 2 «.* - ll/ll 2 - 

m= 0 

As a consequence of (18) we prove that in the case of completeness there is no function 
orthogonal to every function of the orthonormal set, with the trivial exception of a function 
of zero norm: 



J r(x)f(x) 2 dx (* = 1, 2, • • •)• 



Completeness 

Let y Q , y lf • * • be a complete orthonormal set on a ^ x ^ b in a set of functions S. 
Then if a function f belongs to S and is orthogonal to every y m , it must have norm 
zero. In particular , iff is continuous, then f must be identically zero. 


Since f is orthogonal to every y m , the left side of (18) must be zero. If f is continuous, 
then || /|| =0 implies /( x) = 0, as can be seen directly from (2) with / instead of y m 
because r( x) >0. ■ 

Fourier Series 

The orthonormal set in Example 1 is complete in the set of continuous functions on -ir^ x ^ tt. Verify directly 
that /(. r) = 0 is the only continuous function orthogonal to all the functions of that set. 

Solution. Lef / be any continuous function. By the orthogonality (we can omit V27rand W), 
f 1 • fix) dx =0. f fix) cos //a- dx = 0. f f(.x) sin mx dx = 0. 


Hence a m - 0 and b m = 0 in (6) for all m y so that (3) reduces to fix) = 0. 
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This is the end of Chap. 5 on the power series method and the Frobenius method, which 
are indispensable in solving linear ODEs with variable coefficients, some of the most 
important of which we have discussed and solved. We have also seen that the latter are 
important sources of special functions having orthogonality properties that make them 
suitable for orthogonal series representations of given functions. 





[HT| FOURIER-LEGENDRE SERIES 

Showing die details of your calculations, develop: 
1. 7 jc 4 - 6jc 2 2. (x + l) 2 

3. x 3 — x 2 + x — 1 4. 1, x , x 2 , * 3 


5. Prove that if fix) in Example 2 is even [that is, 
/(*) = /(—*)], its series contains only P m (x) with 
even m. 


6-16 


CAS EXPERIMENTS. FOURIER-LEGENDRE 
SERIES 


Find and graph (on common axes) the partial sums up to 
that S mo whose graph practically coincides with that of fix) 
within graphical accuracy. State what m 0 is. On what does 
the size of m 0 seem to depend? 

6. f(x) = sin 7tx 7. /(*) = sin 2ttx 

8. fix) = cos ttx 9. fix) — cos 27 TX 


10. f(x) = cos 3 irx 11. fix ) = e* 

12. f(x) = e~ x2 13. f(x) = 1/(1 + jc 2 ) 

14. fix) = / 0 (<*o t i*)> where a Q1 is the first positive zero 
of J 0 

15. fix) — J 0 ia 0f2 x), where a 02 is the second positive 
zero of J 0 

16. fix) = J 1 ia ltl x) i where a ltl is the first positive zero 
of J l 


17. CAS EXPERIMENT. Fourier-Bessel Series. Use 
Example 3 and again take n — 10 and R = 1, so that 
you get the series 


on the speed of convergence by observing the decrease 
of the coefficients. 

(c) Take fix) — 1 in (19) and evaluate the integrals 
for the coefficients analytically by (24a), Sec. 5.5, with 
v — 1. Graph the first few partial sums on common 
axes. 

18. TEAM PROJECT. Orthogonality on the Entire 
Real Axis. Hermite Polynomials. 13 These orthogonal 
polynomials are defined by He 0 (1) = 1 and 

He n ix) = (-1)V= 2/2 (e-**' 2 ), n = 1, 2, • • • . 

REMARK. As is true for many special functions, the 
literature contains more than one notation, and one 
sometimes defines as Hermite polynomials the 
functions 

H 0 * = 1, H n *ix) = i- 1)V* . 

This differs from our definition, which is preferred in 
applications. 

(a) Small Values of n. Show that 

He x i jc) = x t He 2 ix) = x 2 — 1, 

He z ix) = x 3 - 3x, He 4 ix) = x 4 - 6a: 2 + 3. 

(b) Generating Function. A generating function of 
the Hermite polynomials is 


(19) fix) = aM^x) + a 2 Joioto&c) + a 3 J 0 ia 0>3 x) 

+ • • ■ 

with the zeros a o fl ot 0 & * • * from your CAS (see also 
Table A 1 in App. 5). 

(a) Graph the terms Joicto^x), ■ ■ ■ , 7 0 (oo,io*) for 
0 = at ~ 1 on common axes. 

(b) Write a program for calculating partial sums of 
(19). Find out for what fix) your CAS can evaluate the 
integrals. Take two such fix) and comment empirically 


(20) e*-* 2 ' 2 = 2 On(x)t n 

n-0 

because He n {x) — nla n ix). Prove this. Hint: Use the 
formula for the coefficients of a Maclaurin series and 
note that tx — |f 2 = |a 2 - §(x — t) 2 . 

(c) Derivative. Differentiating the generating function 
with respect to x , show that 

(21) He’ n ix) = nHe n ^ix). 


13 CHARLES HERMITE (1822-1901), French mathematician, is known for his work in algebra and number 
theory. The great HENRI POINCARE (1854-1912) was one of his students. 
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(d) Orthogonality on the Jc-Axis needs a weight 
function that goes to zero sufficiently fast as x — » ±°°. 
(Why?) Show that the Hermite polynomials are 
orthogonal on — & < x < » with respect to the weight 
function r(jt) = e ~^ /z . Hint. Use integration by parts 
and (21). 

(e) ODEs. Show that 

(22) Hento = *He n (x) - He n+1 (x). 

Using this with n — 1 instead of n and (21), show that 
y = He n (x) satisfies the ODE 


(23) y” - xy f 4* ny = 0. 

Show that w = e~ x2/4 y is a solution of Weber’s 
equation 14 

(24) w" + (n - h £ - \x 2 )w = 0 (n = 0, 1, • • •)• 

19. WRITING PROJECT. Orthogonality. Write a short 
report (2-3 pages) about the most important ideas and 
facts related to orthogonality and orthogonal series and 
their applications. 


: 55585 18S 5 55 s 




TIONS AND PROBLEMS 


1. What is a power series? Can it contain negative or 
fractional powers? How would you test for convergence? 

2. Why could we use the power series method for 
Legendre’s equation but needed the Frobenius method 
for Bessel’s equation? 

3. Why did we introduce two kinds of Bessel functions, 
J and y? 

4. What is the hypergeometric equation and why did Gauss 
introduce it? 

5. List the three cases of the Frobenius method, giving 
examples of your own. 

6. What is the difference between an initial value problem 
and a boundary value problem? 

7. What does orthogonality of functions mean and how is 
it used in series expansions? Give examples. 

8. What is the Sturm-Liouville theory and its practical 
importance? 

9. What do you remember about the orthogonality of the 
Legendre polynomials? Of Bessel functions? 

10. What is completeness of orthogonal sets? Why is it 
important? 


11-20 


SERIES SOLUTIONS 


Find a basis of solutions. Try to identify the series as 
expansions of known functions. (Show the details of your 
work.) 


11. y" - 9y = 0 

12. (1 - x) 2 y" + (1 - x)/ - 3y = 0 

13. x y" — (x + 1)/ + y = 0 

14. x 2 /' - 3 xy‘ + 4y = 0 

15. y" + 4xy ' + (4x 2 + 2)y = 0 

16. x z y" — 4 xy' + ( x 2 + 6)y = 0 

17. xy" + (2x + !)>•' + (a- + L)y = 0 


18. (a 2 - 1)/' - 2a/ + 2y = 0 

19. (a 2 - l)y" + 4a/ + 2y = 0 

20. x 2 /' + x/ + (4 a 4 - !)>• = 0 


21-25 j BESSEL’S EQUATION 

Find a general solution in terms of Bessel functions. (Use 
the indicated transformations and show the details.) 

21. x 2 /' + xy 1 + (36a 2 - 2 )>• = 0 (6a = z) 

22. a 2 /' + 5x/ + (a 2 - 12)y = 0 ()> = m/a 2 ) 

23. x 2 /' + x/ + 4(x 4 - 1)3- = 0 (a 2 = z) 

24. 4x 2 /' - 20x/ + (4x 2 + 35)y = 0 (3 - = x 3 m) 

25. y" + k 2 x 2 y = 0 (>• = mVx, \kx 2 = z) 


26-30 BOUNDARY VALUE PROBLEMS 


Find the eigenvalues and eigenfunctions. 

26. y" + Ay = 0, y(0) = 0, y ; ( 7r) = 0 

27. 3-" + Av = 0, 3-(0) = 3-(l), 

y'(0) = /(i) 

28. (xy')' + Aa - 1 3> = 0, 3>(1) = 0, y(e) = 0. 

(Set x = e l .) 

29. x 2 y" + xy' + (Ax 2 - l)y = 0, 

3-(0) = 0, 3'd) = 0 

30. y" + \y = 0, 3-(0) + 3-'(0) = 0, 3’ (2 77) = 0 


1 3 1 —35 1 CAS PROBLEMS 

Write a program, develop in a Fourier-Legendre series, and 
graph the first five partial sums on common axes, together 
with the given function. Comment on accuracy. 

31. C 2 * (-1 = A* = 1 ) 

32. sin (ttx 2 ) (-1 ^ x ^ 1) 

33. 1/(1 + |jc|) ( 1 ~ a* = 1) 

34. | COS TTX I (—1 ^ x ^ 1) 

35. if 0 = a* = 1, 0 if -1 ^ x < 0 


14 HEINRICH WEBER (1842-1913), German mathematician. 
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CHAP. 5 Series Solutions of ODEs. Special Functions 


SUMMARY OF CHARTER-5 

Series Solution of ODEs. Special Functions 


The power series method gives solutions of linear ODEs 

(1) .v" + p(x)y' + q(x)y = 0 

with variable coefficients p and q in the form of a power series (with any center 
.v 0 , e.g., x 0 = 0) 

00 

(2) V’(.v) = 2 a m (x - x 0 ) m = a 0 + cixix - x 0 ) + a 2 (x - x 0 f + • • • . 

m—0 

Such a solution is obtained by substituting (2) and its derivatives into (1). This gives 
a recurrence formula for the coefficients. You may program this formula (or even 
obtain and graph the whole solution) on your CAS. 

If p and q are analytic at a 0 (that is, representable by a power series in powers 
of a* — A* 0 with positive radius of convergence; Sec. 5.2), then ( 1 ) has solutions of 
this form (2). The same holds if /!, /?, q in 


/7(A)y" + p(x)y f 4- q{x)y = 0 


are analytic at a 0 and h(x 0 ) ^ 0, so that we can divide by li and obtain the standard 
form (1). Legendre’s equation is solved by the power series method in Sec. 5.3. 
The Frobenius method (Sec. 5.4) extends the power series method to ODEs 


(3) 


/' + 


a(x) , 


+ 


b(x) 

(x - x 0 f ' 


= 0 


whose coefficients are singular (i.e., not analytic) at a 0 , but are “not too bad,” 
namely, such that a and b are analytic at x 0 . Then (3) has at least one solution of 
the form 


(4) ,V(A-) = (AT - A' 0 ) r 2 - Xo) m = a o(X - X 0 f + a^X ~ X 0 ) r+1 + • • • 

? n .=0 

where r can be any real (or even complex) number and is determined by substituting 
(4) into (3) from the indicial equation (Sec. 5.4), along with the coefficients of (4). 
A second linearly independent solution of (3) may be of a similar form (with different 
/• and a m * s) or may involve a logarithmic term. Bessel’s equation is solved by the 
Frobenius method in Secs. 5.5 and 5.6. 

“Special functions” is a common name for higher functions, as opposed to the 
usual functions of calculus. Most of them arise either as nonelementary integrals 
[see (24)-(44) in App. 3. 1 ] or as solutions of ( 1 ) or (3). They get a name and notation 
and are included in the usual CASs if they are important in application or in theory. 




Summary of Chapter 5 


219 


Of this kind, and particularly useful to the engineer and physicist, are Legendre’s 
equation and polynomials (Sec. 5.3), Gauss’s hypergeometric 

equation and functions F(a 9 b 9 c; x) (Sec. 5.4), and Bessel’s equation and 
functions J v and Y v (Secs. 5.5, 5.6). 

Modeling involving ODEs usually leads to initial value problems (Chaps. 1-3) 
or boundary value problems. Many of the latter can be written in the form of 
Sturm-Liouville problems (Sec. 5.7). These are eigenvalue problems involving 
a parameter A that is often related to frequencies, energies, or other physical 
quantities. Solutions of Sturm-Liouville problems, called eigenfunctions, have 
many general properties in common, notably the highly important orthogonality 
(Sec. 5.7), which is useful in eigenfunction expansions (Sec. 5.8) in terms of cosine 
and sine (“ Fourier series”, the topic of Chap. 11), Legendre polynomials, Bessel 
functions (Sec. 5.8), and other eigenfunctions. 





CHAPTER 6 

Laplace Transforms 


The Laplace transform method is a powerful method for solving linear ODEs and 
corresponding initial value problems, as well as systems of ODEs arising in engineering. 
The process of solution consists of three steps (see Fig. 1 12). 

Step 1. The given ODE is transformed into an algebraic equation (“subsidiary 
equation”)* 

Step 2. The subsidiary equation is solved by purely algebraic manipulations. 

Step 3. The solution in Step 2 is transformed back, resulting in the solution of the given 
problem. 



Fig. ID. Solving an I VP by Laplace transforms 


Thus solving an ODE is reduced to an algebraic problem (plus those transformations). 
This switching from calculus to algebra is called operational calculus. The Laplace 
transform method is the most important operational method to the engineer. This method 
has two main advantages over the usual methods of Chaps. 1-4: 

A. Problems are solved more directly, initial value problems without first determining 
a general solution, and nonhomogeneous ODEs without first solving the corresponding 
homogeneous ODE. 

B. More importantly, the use of the unit step function (Heaviside function in 
Sec. 6.3) and Dirac’s delta (in Sec. 6.4) make the method particularly powerful for 
problems with inputs (driving forces) that have discontinuities or represent short impulses 
or complicated periodic functions. 

In this chapter we consider the Laplace transform and its application to engineering 
problems involving ODEs. PDEs will be solved by the Laplace transform in Sec. 12.11. 

General formulas are listed in Sec. 6.8, transforms and inverses in Sec. 6.9. The 
usual CASs can handle most Laplace transforms. 

Prerequisite: Chap. 2 

Sections that may be omitted in a shorter course : 6.5, 6.7 

References and Answers to Problems : App. 1 Part A, App. 2. 
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6.1 Laplace Transform. Inverse Transform. 
Linearity. s-Shifting 

If /(/) is a function defined for all t ^ 0, its Laplace transform 1 is the integral of f(t) 
times e~ st from t = 0 to °o. It is a function of s , say, F(s), and is denoted by ££(/); thus 

(1) F(s) = £(/) = J e~ st f(t) dt. 

J o 

Here we must assume that f(t ) is such that the integral exists (that is, has some finite 
value). This assumption is usually satisfied in applications — we shall discuss this near the 
end of the section. 

Not only is the result F(s) called the Laplace transform, but the operation just described, 
which yields F(s) from a given /(r), is also called the Laplace transform. It is an “integral 
transform” 

F(s) = \ k(s , t)f(t) dt 
J o 

with “kernel” k(s> t ) = e~ st . 

Furthermore, the given function f{t) in (1) is called the inverse transform of F(s) and 
is denoted by $£“\F); that is, we shall write 

(i*) m = <e-\n. 

Note that (1) and (1*) together imply $£~ x (SZ(f)) = / and £(£~ l (F)) = F. 

Notation 

Original functions depend on / and their transforms on s — keep this in mind! Original 
functions are denoted by lowercase letters and their transforms by the same letters in 
capital, so that F(s) denotes the transform of /(f), and Y(s) denotes the transform of y(f), 
and so on. 

EXAMPLE 1 Laplace Transform 

Lei f(t) = L when t ^ 0. Find F{s). 

Solution . From (I) we obtain by integration 

£(/) = £( 1 )= f e-^dt 
J o 

1 PIERRE SIMON MARQUIS DE LAPLACE (1749-1827), great French mathematician, was a professor in 
Paris. He developed the foundation of potential theory and made important contributions to celestial mechanics, 
astronomy in general, special functions, and probability theory. Napoleon Bonaparte was his student for a year. 
For Laplace’s interesting political involvements, see Ref. [GR2J, listed in App. 1. 

The powerful practical Laplace transform techniques were developed over a century later by the English 
electrical engineer OLIVER HEAVISIDE (1850-1925) and were often called “Heaviside calculus.” 

We shall drop variables when this simplifies formulas without causing confusion. For instance, in (1) we 
wrote <£(/) instead of £(/)(*) and in (1*) $~\F) instead of 2T\F)(t). 
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EXAMPLE 2 


THEOREM 1 


PROOF 


EXAMPLE 3 


Our notation is convenient, but we should say a word about it. The interval of integration in (I) is infinite. 
Such an integral is called an improper integral and. by definition, is evaluated according to the rule 


[ e~ st f(i)di= lim f e~ st f(l) dt. 
T-*o c J n 


Hence our convenient notation means 


$t dt = lim 

- - e~ s> 

T 

= lim 

T— oc 

s 

o T -°° 


We shall use this notation throughout this chapter. 

Laplace Transform ££(< e **) of the Exponential Function e°* 

Let f(/) = e at when / ^ 0, where a is a constant. Find ££(/). 
Solution . Again by ( 1 ), 

_oc 

X(e at ) = f e~ st e at dt = — — 

J 0 a - s 

hence, when s — a > 0, 


is > 0). 


y!(e at ) = 


Must we go on in this fashion and obtain the transform of one function after another 
directly from the definition? The answer is no. And the reason is that new transforms can 
be found from known ones by the use of the many general properties of the Laplace 
transform. Above all, the Laplace transform is a “linear operation,” just as differentiation 
and integration. By this we mean the following. 


Linearity of the Laplace Transform 

The Laplace transform is a linear operation; that is, for any functions fit) and g(t) whose 
transforms exist and any constants a and b the transform of a fit) + bg{t) exists , and 

i£{af(t) + bg(t)) = aSE[m) + b<£[g(t)). 


By the definition in (1), 

f°° 

itlam + bgii)) = e~ st [afit) + bgit)] dt 
J o 

cc 00 

= a e- st fi t) dt + b\ e~ si git) dt = aX[fit)} + b%{git ) }. ■ 

■'o J o 

Application of Theorem 1: Hyperbolic Functions 

Find the transforms of cosh at and sinh at. 

Solution . Since cosh at = %(e at + e ~ at ) and sinh at = %(e at - e ~ at ), we obtain from Example 2 and 
Theorem ! 

#(cosli at) = \me at ) + X{e~ at )) = ) = ■ ' 

2 2 \s - a s + a ) s 2 - a 2, 

<£(sinh at) = { me at ) - X(e~ at )) = U — -J— ) = a . ■ 

2 2 \ s — a s + a ) s 2 - a 2 
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EXAMPLE 


Cosine and Sine 

Derive the formulas 


o w 

5£(cos cot ) = —5 5 - , 3?($in ft}/) = —5 5 - 

r + a) 2 r + <w 2 


Solution by Calculus. We write L c = ££(cos and L s = ££(sin ft}/). Integrating by parts and noting that the 
integral-free parts give no contribution from the upper limit we obtain 


J f &~ St | 00 ^ r 

I e~ st cos (titdt = cos ft}/ I £»" sf sii 

0 ~ s \o s J Q 

J f 00 e~~ st 00 w f 00 

I <?“ st sin u>t dt — sin (tit -I- — I e~ st 

0 0 s J o 


1 a) 

sin (titdt = L x , 

s s s 


cos (titdt = — L-. 
s 

By substituting L s into the formula for L c on the right and then by substituting L c into the formula for L s on 
the right, we obtain 


1 

ft} j 

f 0) \ 

/ ft} 2 \ 

1 

= — - 


\-L c ] , 

yi + T = 


s 

s ' 


\ s 2 / 

5 

(ti 

/I 


/ ft> 2 \ 

co 

s 


• 7 Ls ) ’ 

M 1 + if) = 

7 

sforms Using Derivatives . 

See next section. 





4 " * 2 + <o 2 • 


Solution by Complex Methods . In Example 2, if we set a = iw with i = V-T, we obtain 


SE(e iwt ) = 


1 


s + io) 


s + /(W 


-I- / • 


s — it 0 (s — i(ti)(s + it i>) s 2 - 1 - a} 2 s 2 + ft} 2 ' 5 2 + ft} 2 

Now by Theorem I and e ltot = cos cot + / sin «}/ [see (II) in Sec. 2.2 with (ot instead of /] we have 


ZE(e iait ) — S£(cos (ot 4- / sin cot) = S?(cos &>/) 4- iX( sin ft}/). 


If we equate the real and imaginary parts of this and the previous equation, the result follows. (This formal 
calculation can be justified in the theory of complex integration.) M 


Basic transforms are listed in Table 6.1. We shall see that from these almost all the others 
can be obtained by the use of the general properties of the Laplace transform. Formulas 
1-3 are special cases of formula 4, which is proved by induction. Indeed, it is true for 
n = 0 because of Example 1 and 0! = 1. We make the induction hypothesis that it holds 
for any integer n ^ 0 and then get it for n 4- 1 directly from (1). Indeed, integration by 
parts first gives 


r*® 1 

2(f n+1 ) = e~ st t n+1 dt = - - e ~ st t n+1 

Jn a 


n + 1 

S 


\ e~ st t n dt. 
J o 


Now the integral-free part is zero and the last part is (n + 1 )fs times ££(/ n ). From this 
and the induction hypothesis, 


2 (,"«) = i±i son = JL + J 


,n + 1 


(n + 1)! 

s n+2 


This proves formula 4. 
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Table 6.1 Some Functions /(t) and Their Laplace Transforms i£(/) 



m 

m) 

1 

1 

1 Is 

2 

t 

Ms 2 

3 


2 i/j 3 

4 

s N 

£ ^ 
II 

n! 

s «+i 

5 

t a 

(a positive) 

Ha + 1) 
s° +1 

6 

e at 

1 

s — a 



m 

2(f) 

1 

cos cot 

s 

s 2 + co 2 

8 

sin cot 

CO 

s 2 + CO 2 

9 

cosh at 

s 

s 2 — a 2 

10 

sinh at 

a 

s 2 — a 2 

11 

e at cos cot 

s — a 

is - a) 2 + co 2 

12 

e at sin cot 

CO 

is — a) 2 + co 2 


r(tf + 1) in formula 5 is the so-called gamma function [(15) in Sec. 5.5 or (24) in 
App. A3.1]. We get formula 5 from (1), setting st = x : 

r°° r°° / x\ a dx l r°° 

%•) = l .-*!•* = J[ «-* (7) T = 7TT l 

where 5 > 0. The last integral is precisely that defining T(tf + 1), so we have 
T(a -1- l)/s a+1 , as claimed. (CAUTION! T(fl + 1) has x a in the integral, not A* a+1 .) 
Note the formula 4 also follows from 5 because T(;t 4- 1) = n\ for integer n ^ 0. 
Formulas 6-10 were proved in Examples 2-4. Formulas 11 and 12 will follow from 7 
and 8 by “shifting,” to which we turn next. 

s-Shifting: Replacing s by s — a in the Transform 

The Laplace transform has the very useful property that if we know the transform of fit), 
we can immediately get that of e at f(t\ as follows. 


THEOREM 2 


First Shifting Theorem, s-Shifting 

If fit) has the transform F(s) ( where s > kfor some k), then e at f(t) has the transform 
F(s — a) ( where s — a > k). In formulas, 

%{e at m) = F(s - a) 

or, if we take the inverse on both sides, 

e at f{f) = %-HFis - a)}. 
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PROOF 


EXAMPLE 5 


We obtain F(s — a) by replacing s with s — a in the integral in (1), so that 

oo 00 

F(s - a) = e~ ($ - a}t f(t)dt = e~ st [e at f(tj] dt = 2{e at f(t)}. 

J o J o 

If F(s) exists (i.e., is finite) for s greater than some k, then our first integral exists for 
s — a> k. Now take the inverse on both sides of this formula to obtain the second formula 
in the theorem. (CAUTION! —a in F(s — a) but +a in e at f(t).) ■ 

s-Shifting: Damped Vibrations. Completing the Square 

From Example 4 and die first shifting theorem we immediately obtain formulas 11 and 12 in Table 6.1, 

s - a 


$£{e at cos <ot] = 


(s - a) 2 + <o 2 


$£{e at sin cot) = 


(5 — a ) 2 + a ) 2 


For instance, use these formulas to find the inverse of the transform 

3js - 137 

m ~ s 2 + 2s + 401 ’ 

Solution . Applying the inverse transform, using its linearity (Prob. 28), and completing the square, we obtain 


= f 3(,-M)-140 | _ r ,tl 1 _ r 20_ 

f 1 (j + l) 2 + 400 1 1 (.s + l) 2 + 20 2 J 1 (i + l) 2 + 

We now see that the inverse of the right side is the damped vibration (Fig. 1 13) 

/(/) = *"*(3 cos 20/ - 7 sin 20/). 


20 * 



Fig. 113. Vibrations in Example 5 


Existence and Uniqueness of Laplace Transforms 

This is not a big practical problem because in most cases we can check the solution of 
an ODE without too much trouble. Nevertheless we should be aware of some basic facts. 

A function /(f) has a Laplace transform if it does not grow too fast, say, if for all 
f S 0 and some constants M and k it satisfies the “growth restriction” 


( 2 ) 


|/(f)| § Me kt . 
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(The growth restriction (2) is sometimes called “growth of exponential order,’’ which may 
be misleading since it hides that the exponent must be kt , not kt 2 or similar.) 

/(/) need not be continuous, but it should not be too bad. The technical term (generally 
used in mathematics) is piecewise continuity. f(t) is piecewise continuous on a finite interval 
a ^ t where f is defined, if this interval can be divided into finitely many subintervals 
in each of which / is continuous and has finite limits as t approaches either endpoint of such 
a subinterval from the interior. This then gives finite jumps as in Fig. 1 14 as the only possible 
discontinuities, but this suffices in most applications, and so does the following theorem. 





a \ b t 


Fig. 114. Example of a piecewise continuous function f(t). 
(The dots mark the function values at the jumps.) 


THEOREM 3 


Existence Theorem for Laplace Transforms 

If f(t) is defined and piecewise continuous on every finite interval on the semi-axis 
t = 0 and satisfies (2) for all t ^ 0 and some constants M and k, then the Laplace 
transform £{f) exists for all s > k. 


PROOF Since f(t) is piecewise continuous, e~ st f(t) is integrable over any finite interval on the 
/-axis. From (2), assuming that s > k (to be needed for the existence of the last of the 
following integrals), we obtain the proof of the existence of ££(/) from 



e~ st f( t) dt 




M 

s - k 


Note that (2) can be readily checked. For instance, cosh / < e\ t n < tile* (because t n ln\ 
is a single term of the Maclaurin series), and so on. A function that does not satisfy (2) 
for any M and k is e t2 (take logarithms to see it). We mention that the conditions in 
Theorem 3 are sufficient rather than necessary (see Prob. 22). 


Uniqueness. If the Laplace transform of a given function exists, it is uniquely 
determined. Conversely, it can be shown that if two functions (both defined on the positive 
real axis) have the same transform, these functions cannot differ over an interval of positive 
length, although they may differ at isolated points (see Ref. [A 1 4] in App. 1). Hence we 
may say that the inverse of a given transform is essentially unique. In particular, if two 
continuous functions have the same transform, they are completely identical. 


s=et- 


1-20 


LAPLACE TRANSFORMS 


Find the Laplace transforms of the following functions. 
Show the details of your work, (a, b, k, 0are constants.) 

1 * t 2 - 2 / 2 . (/ 2 - 3) 2 


3. COS 27Tt 
5. e 2t cosh / 

7. cos (cut + 6) 

g* ^3a-2br. 


4. sin 2 4/ 

6. e~ l sinh 5/ 
8. sin (3 1 — \) 
10. -8 sin 0.2/ 
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28. (Inverse transform) Prove that X 1 is linear. Hint. 
Use the fact that X is linear. 


29-40 


INVERSE LAPLACE TRANSFORMS 

Given F(s) = X(f), find f(t). Show the details. (L, n , k y ci % 
b are constants.) 


29. 

31. 

33. 

35. 

37. 

39. 


4s — 3tt 


5 4 - 3a 2 4- 12 

5 5 

mrL 

L 2 s 2 + n 2 * 1 
8 

s 2 4- 4a* 

1 

(s - V3 )(s 4- V5) 

1 1 

s 2 4- 5 5 4- 5 


30. 

32. 

34. 


25 4- 16 
.v 2 - 16 
10 

2a- 4- V2 

20 

(5 — 1)(5 + 4) 


36. 2 


(* + l) 2 

s + k 2 


k=* 1 

18.v - 12 

38- -7T2 T 

9a- 2 - 1 


40. 


1 


(5 4- a)(s 4- b) 


21. Using X(f) in Prob. 13, find X(fi), where /j(/) = 0 if 
/ ^ 2 and f x (t) = 1 if t > 2. 

22 . (Existence) Show that ££(1/Vir) = V77/5. [Use 
(30) r(|) = Vtt in App. 3. 1 .] Conclude from this that 
the conditions in Theorem 3 are sufficient but not 
necessary for the existence of a Laplace transform. 

23. (Change of scale) If X(f(t)) = F(s) and c is any 
positive constant, show that X(f(ct)) = F(s!c)!c. (Hint: 
Use (1).) Use this to obtain 56(cos cot ) from «S£(cos /). 

24. (Nonexistence) Show that e * 2 does not satisfy a 
condition of the form (2). 

25. (Nonexistence) Give simple examples of functions 
(defined for all x ^ 0) that have no Laplace transform. 

26. (Table 6.1) Derive formula 6 from formulas 9 and 10. 

27. (Table 6.1) Convert Table 6.1 from a table for finding 
transforms to a table for finding inverse transforms (with 
obvious changes, e.g., X~~ l (Hs n ) = / M ~V(n - 1)!, etc.). 


1 4 1-54 1 APPLICATIONS OF THE FIRST SHIFTING 
THEOREM (s-SHIFTING) 

In Probs. 41-46 find the transform. In Probs. 47-54 find 
the inverse transform. Show the details. 


41. 

OJ 

bo 

CO 

£ 

42. 

-3t 4 e- 0!i ' 

43. 

5e~ at sin cot 

44. 

e~ 3t cos irt 

45. 

e~ kt (ci cos t 4- b sin t) 



46. 

e- l (a 0 4- a j/ 4- • • • 4- 

u n t 1 

n ) 

47. 

7 

48. 

TT 

(.V - l) 3 

(S + TT) 2 

49. 

V8 

50. 

s — 6 

(.7 + V2) 3 

(.7 -l) 2 + 4 

51. 

15 

52. 

4.7 - 2 

5 2 4- 4a 4- 29 

.7 2 - 6.7 + 18 

53. 

77 

54. 

2a* ~ 56 

A* 2 4- 10 775 4- 24 7 T 2 

i 2 - 4.7 - 12 


6.2 Transforms of Derivatives and Integrals. 
ODEs 


The Laplace transform is a method of solving ODEs and initial value problems. The crucial 
idea is that operations of calculus on functions are replaced by operations of algebra 
on transforms . Roughly, differentiation of f(t) will correspond to multiplication of i 6(f) 
by 5 (see Theorems 1 and 2) and integration of f(t) to division of X(f) by s. To solve 
ODEs, we must first consider the Laplace transform of derivatives. 
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THEOREM 1 


PROOF 


THEOREM 2 


EXAMPLE 1 


Laplace Transform of Derivatives 

The transforms of the first and second derivatives of f(t) satisfy 

(1) m') = sx(f) - m 

(2) £(/") = S 2 £(f) - Sf( 0) - /'( 0). 

Formula ( 1 ) holds iff(t) is continuous for all / ^ 0 and satisfies the growth restriction 
(2) in Sec. 6.1 and f\t ) is piecewise continuous on every finite interval on the semi- 
axis t 0. Similarly, (2) holds if f and f f are continuous for all t^ 0 and satisfy 
the growth restriction and f is piecewise continuous on every finite interval on the 
semi-axis t 0. 


We prove (1) first under the additional assumption that /' is continuous. Then by the 
definition and integration by parts, 

oc oc 

e- si f(t)dt = [e-«m] +s 
o 

Since / satisfies (2) in Sec. 6. 1, the integrated part on the right is zero at the upper limit 
when s> k y and at the lower limit it contributes — /( 0). The last integral is i£(/). It exists 
for s> k because of Theorem 3 in Sec. 6.1. Hence i£(/') exists when s > k and (1) holds. 

If f is merely piecewise continuous, the proof is similar. In this case the interval of 
integration of f must be broken up into parts such that f is continuous in each such part. 
The proof of (2) now follows by applying (1) to f" and then substituting (1), that is 

££(/") = sZ{f ) - f (0) = s[s%(f) - /(0)] = s 2 i£(/) - 5/(0) - /'( 0). ■ 

Continuing by substitution as in the proof of (2) and using induction, we obtain the 
following extension of Theorem 1 . 


X(f') = f 


\ e- st f(t)dt. 


Laplace Transform of the Derivative / (n) of Any Order 

Let /, jf\ * • • , / (n “ 1) be continuous for all r ^ 0 and satisfy the growth restnction 

(2) in Sec . 6.1 . Furthermore , let f n) be piecewise continuous on every finite interval 
on the semi-axis t ^ 0. Then the transform of f n) satisfies 

(3) = s n X(f) - s n - x f( 0) - s n ~ 2 f'( 0) f n ~ ly ( 0). 


Transform of a Resonance Term (Sec. 2.8) 

Let f(t) = t sin tot. Then /( 0) = 0. /'(/) = sin tot + ot cos tot, f( 0) = 0. f = 2to cos tot — to 2 t sin tot. Hence 
by (2), 


nf) = 2 <o - 2 * 2 - o> 2 X{f) = s 2 $(f), 
s + or 


thus 


X(f) = sin wt) — 


2tos 

(s 2 + to 2 ) 2 * " 
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EXAMPLE 2 


THEOREM 3 


PROOF 


EXAMPLE 3 


Formulas 7 and 8 in Table 6.1, Sec. 6.1 

This is a third derivation of £(cos <oi) and if (sin cot); cf. Example 4 in Sec. 6.1. Let /(/) = cos cot. Then 
/(0) = 1, /'(0) = 0, f"(t) = -co 2 cos cot. From this and (2) we obtain 

SE(f") = s 2 $(J) - s = -co 2 X(f). By algebra. S6(cos (oi) = ^ 2 ^2 • 

Similarly, let g = sin cot . Then g(0) = 0, g = <o cos <ot. From this and (1) we obtain 

co c o 

X(g ) = s$£(g) = o>5£(cos <ot). Hence ££(sin o>t ) = — Z£(cos <ot) = 2 + — 2 * ■ 

Laplace Transform of the Integral of a Function 

Differentiation and integration are inverse operations, and so are multiplication and division. 
Since differentiation of a function /(f) (roughly) corresponds to multiplication of its 
transform ££(/) by s , we expect integration of /(f) to correspond to division of ££(/) by s : 


Laplace Transform of Integral 

Let F(s) denote the transform of a function /(f) which is piecewise continuous for 
t ^ 0 and satisfies a growth restriction (2), Sec. 6.1. Then, for s > 0, s > k 7 and 
t > 0, 




\ r‘ 1 

1 


r* J 

f 1 1 

(4) 

X- 


* = -m 

s 

thus 

H 

I 

Si 

II 

b 

'P 

0 

ltH 


Denote the integral in (4) by gif). Since /(f) is piecewise continuous, g(t) is continuous, 
and (2), Sec. 6.1, gives 


|g(0| = 


£/( t) dr S £ |/(r)| drfsM £e fcr dr = y(e fct - 1) Si ye fct (k > 0 ). 


M 


This shows that g(f) also satisfies a growth restriction. Also, g'(t) = /(f), except at points 
at which /(f) is discontinuous. Hence g{t) is piecewise continuous on each finite interval 
and, by Theorem 1, since g( 0) = 0 (the integral from 0 to 0 is zero) 


2{/(f)} = %{g'{t)} = s£{g{t ) } - g(0) = sX{gf)}. 


Division by s and interchange of the left and right sides gives the first formula in (4), 
from which the second follows by taking the inverse transform on both sides. ■ 


Application of Theorem 3: Formulas 19 and 20 in the Table of Sec. 6.9 

1 1 

Using Theorem 3, find the inverse of — 5 5- and g g 0“ - 

s(s + or) s(s + or) 

Solution, From Table 6. 1 in Sec. 6. 1 and the integration in (4) (second formula with the sides interchanged) 


we obtain 



sin cot 




1 

s(s 2 + (O 2 ) 



sin (or 

(O 


1 

dr = — «-(l — cos (ot). 

co 
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This is formula 19 in Sec. 6.9. Integrating this result again and using (4) as before, we obtain formula 20 in 
Sec. 6.9: 


,f 1 1 1 F r sinwrl* / sin tot 

% I 2/ 2 , 27 I = 2 I 0 ~ cos <OT ) dr — 2 3 = ~ 3 • 

{ s (s -I- <0 ) J (o J o L o> » J 0 or 

It is typical that results such as these can be found in several ways. In this example, try partial fraction 
reduction. ■ 


Differential Equations, Initial Value Problems 

We shall now discuss how the Laplace transform method solves ODEs and initial value 
problems. We consider an initial value problem 

(5) y” 4 ay 9 + by = /*(/), .v(0) = K 0 , y'( 0) = AT, 


where a and b are constant. Here r(t) is the given input (driving force) applied to the 
mechanical or electrical system and y(t) is the output (response to the input) to be obtained. 
In Laplace’s method we do three steps: 

Step 1. Setting up the subsidiary equation. This is an algebraic equation for the transform 
Y = X(y) obtained by transforming (5) by means of (1) and (2), namely, 

[s 2 Y - sy( 0) - /( 0)] 4- a[sY - y(0)] + bY = R(s) 

where R(s) = X(r). Collecting the /-terms, we have the subsidiary equation 

C? 2 + as 4- b)Y = (s + fl)y(O) 4- y'(O) -I- R(s). 

Step 2. Solution of the subsidiary equation by algebra . We divide by s 2 4 as 4- b and 
use the so-called transfer function 

^ s 2 + as 4 b (s 4 \a ) 2 4 b — \a 2 

(Q is often denoted by H y but we need H much more frequently for other purposes.) This 
gives the solution 

( 7 ) Y(s) = [(s 4 a)y( 0 ) 4 y'(0)]Q(s) 4 R(s)Q(s). 

If y( 0) = y ; (0) = 0, this is simply / = RQ\ hence 

_ Y _ iC(output) 

R iC(input) 

and this explains the name of Q. Note that Q depends neither on r(t) nor on the initial 
conditions (but only on a and b). 

Step 3. Inversion ofY to obtain y = t£~ x (Y). We reduce (7) (usually by partial fractions 
as in calculus) to a sum of terms whose inverses can be found from the tables (e.g., in 
Sec. 6.1 or Sec. 6.9) or by a CAS, so that we obtain the solution y(t) = ££~\Y) of (5). 
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EXAMPLE 4 Initial Value Problem: The Basic Laplace Steps 

Solve 

/' — y — t, ><0) = i, /(0) = i. 

Solution . Step 1. From (2) and Table 6. 1 we get the subsidiary equation [with / = ££(/)] 

s 2 Y - sy( 0) - /(0) - Y = I/* 2 , thus Cs 2 - \)Y = a* + 1 + l/.v 2 . 

Step 2. The transfer function is Q = U(s 2 - 1), and (7) becomes 

I s 4- 1 1 

Y = (s + ])Q+ Q = j - + _ ~ • 

Simplification and partial fraction expansion gives 



Step 3 From this expression for Y and Table 6.1 we obtain the solution 

y(>) = ST\Y) = = e * + sinh ' - »■ 

The diagram in Fig. 1 15 summarizes our approach. 


i-space s-space 



Fig. 115. Laplace transform method 


EXAMPLE 5 Comparison with the Usual Method 

Solve the initial value problem 

y" + y + 9y = 0, y(0) = 0.16, >>'(0) = 0. 

Solution. From (1) and (2) we see that the subsidiary equation is 

s 2 Y- 0.16s + sY- 0.16 + 9Y = 0, thus (s 2 + 5 + 9)Y = O.I6(s + I). 


The solution is 

_ Q.I6(s + 1) _ 0.16(.v + |) + 0.08 
Y ~ s 2 + s + 9 Cs + ±) 2 + f 

Hence by the first shifting theorem and the formulas lor cos and sin in Table 6. 1 we obtain 
y(l) = <E~\Y) = e -t/Z (o.l6 cos sin 

= e~°- st (0A6 cos 2.96 1 + 0.027 sin 2.96/). 


This agrees with Example 2, Case (ITT) in Sec. 2.4. The work was less. 
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Advantages of the Laplace Method 

1. Solving a nonhomogeneous ODE does not require first solving the 
homogeneous ODE. See Example 4. 

2. Initial values are automatically taken care of. See Examples 4 and 5. 

3. Complicated inputs r[t) (right sides of linear ODEs) can be handled very 
efficiently , as we show in the next sections. 


EXAMPLE 6 Shifted Data Problems 

This means initial value problems with initial conditions given at some / = t 0 > 0 instead of i = 0. For such 
a problem set / = 7 + r 0 » so that r = l o g* ves t = 0 and the Laplace transform can be applied. For instance, 
solve 

y" + y = 2u _y(j77) = \tt, /(ij7r) - 2 — V2. 

Solution . We have r 0 = \i r and we set / = 7 + Then the problem is 

y" + y = 2(7 + |tt), m = y'(0) = 2 - V2 

where y(r) = y(t). Using (2) and Table 6.1 and denoting the transform of y by F, we see that the subsidiary 
equation of the “shifted” initial value problem is 

s 2 Y -s-hir- (2- V2) + Y = \ + — , thus (s 2 + \)Y = \ + — + ]ttts + 2 - Vl. 

s s s s 2 

Solving this algebraically for F, we obtain 

_ 2 \'TT 2 — V2 

(s 2 + 1 )s 2 (s 2 + 1)5 s 2 + 1 s 2 + 1 

The inverse of the first two terms can be seen from Example 3 (with co = 1), and the last two terms give cos 
and sin, 

y = %-\Y) = 2(7— sin?) + ^7r(l — cos?) + \ tt cos 7 + (2 — V2) sin? 

= 2? + — V2 sin 7. 

Now t = t — 57t, sin / = (sin t - cos /), so that the answer (the solution) is 

y = 2/ — sin / + cos /. H 


PROBLEM SET 6.2 


1-8 


OBTAINING TRANSFORMS BY 
DIFFERENTIATION 


Using (1) or (2), find £(/) if /(f) equals: 
1. te kt 2. t cos 5t 

3. sin 2 o)t A — 2 

5. sinh 2 at 
7. t sin rt 


4. cos 2 *JTt 
6. cosh 2 |r 

8. sin 4 t (Use Prob. 3.) 


9. (Derivation by different methods) It is typical that 
various transforms can be obtained by several methods. 
Show this for Prob. 1. Show it for ££(cos 2 §/) (a) by 


expressing cos 2 |/ in terms of cos/, (b) by using 
Prob. 3. 


10-24 


INITIAL VALUE PROBLEMS 


Solve the following initial value problems by the Laplace 
transform. (If necessary, use partial fraction expansion as 
in Example 4. Show all details.) 


10. y' + 4y = 0, y(0) = 2.8 

11. y' + \y = 17 sin 2t, y( 0) = -1 

12. y" - y' - 6 y = 0, y(0) = 6, 

/(0) = 13 
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13. y" - fy = 0. y(0) = 4, /(0) = 0 

14. y" - 4 y' + Ay = 0, j>(0) = 2.1, 

/(0) = 3.9 

15. y" + 2y' + 2y = 0, )-(0) = 1, 

y'(0) = -3 

16. y" + ky' - 2 k 2 y = 0, y(0) = 2, 

/(O) = 2k 

17. y" + 7/ + \2y = 21e 3£ , )>(0) = 3.5, 

y'(0) = -10 

18. y" + 9y = 10<r £ , y(0) = 0, /(O) = 0 

19. y" + 3y' + 2.25 .y = 9f 3 + 64, ^(0) = 1, 

>-'(0) = 31.5 

20. y" - 6y' + 5y = 29 cos 2f, y(0) = 3.2, 

y'(0) = 6.2 

21. (Shifted data) y' - 6y = 0, y(2) = 4 

22. y" - 2y - 3y = 0, y(i) = -3, 

/(l) = -17 

23. y" + 3/ - 4y = 6e zt - 2 , y(D = 4, 

>'( 1 ) = 5 

24. y" + 2y' + 5v = 50 f - 150, y( 3) = -4, 

/(3) = 14 

25. PROJECT. Comments on Sec. 6.2. (a) Give reasons 
why Theorems 1 and 2 are more important than 
Theorem 3. 

(b) Extend Theorem 1 by showing that if f(t) is 
continuous, except for an ordinary discontinuity (finite 
jump) at some t = a (> 0), the other conditions 
remaining as in Theorem 1, then (see Fig. 116) 

(1*) X(f') = s£(f) - f(0) - [/(« + 0) - f(a - 0)]*-^. 

(c) Verify (1*) for f(t) = e~ l if 0 < t < 1 and 0 if 
t > 1. 

(d) Verify (1*) for two more complicated functions of 
your choice. 

(e) Compare the Laplace transfonn of solving ODEs 
with the method in Chap. 2. Give examples of your 


own to illustrate the advantages of the present method 
(to the extent we have seen them so far). 



Fig. 116. Formula (1*) 

26. PROJECT. Further Results by Differentiation. 
Proceeding as in Example 1, obtain 

s 2 - a? 

(a) X(t COS tot) = (J 2 + ^2)2 

and from this and Example 1: (b) formula 21, (c) 22, 
(d) 23 in Sec. 6.9, 

s 2 + a 2 

(e) cosh at) = ■ 2 _ ~ 2 f > 

las 

(f) sinh at) = 2 _ . 


1 27-34 


OBTAINING TRANSFORMS BY 
INTEGRATION 


Using Theorem 3, find f(t) if 5£(/) equals: 

10 


27. 

1 

28. 

i 2 + s!2 

29. 

1 

30. 

s 3 - ks 2 

31. 

5 

32. 

— 5s 

33. 

1 

34. 

* 4 - 4s 2 


1 


s 3 H- 9s 

1 

* 4 + t rV 


35. (Partial fractions) Solve Probs. 27, 29, and 31 by 
using partial fractions. 


6.: Unit Step Function. t-Shifting 

This section and the next one are extremely important because we shall now reach the point 
where the Laplace transform method shows its real power in applications and its superiority 
over the classical approach of Chap. 2. The reason is that we shall introduce two auxiliary 
functions, the unit step function or Heaviside function u(t — a) (below) and Dirac 's delta 
8{t — a) (in Sec. 6.4). These functions are suitable for solving ODEs with complicated 
right sides of considerable engineering interest, such as single waves, inputs (driving forces) 
that are discontinuous or act for some time only, periodic inputs more general than just 
cosine and sine, or impulsive forces acting for an instant (hammerblows, for example). 
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Unit Step Function (Heaviside Function) u(t — a) 

The unit step function or Heaviside function u(t — a) is 0 for t < a, has a jump of size 
1 at t = a (where we can leave it undefined), and is 1 for t > a , in a formula: 


( 1 ) 


u(t - a) 


0 

1 


if t < a 
if t> a 


(a ^ 0). 


Figure 1 17 shows the special case u(t\ which has its jump at zero, and Fig. 1 18 the general 
case u(t — a) for an arbitrary positive a . (For Heaviside see Sec. 6.1.) 

The transform of u(t — a) follows directly from the defining integral in Sec. 6.1, 


oo oc 

X{ u{t - a)} = J e- st u(t - a) dt = J <T st * 1 dt = - 




I t~a 


here the integration begins at t - a (^ 0) because u(t - a) is 0 for t < a. Hence 


( 2 ) 


£{u(t - a)} = — 
s 


(s > 0). 


The unit step function is a typical “engineering function” made to measure for 
engineering applications, which often involve functions (mechanical or electrical 
driving forces) that are either “off 5 or “on.” Multiplying functions /(f) with u(t — a), 
we can produce all sorts of effects. The simple basic idea is illustrated in Figs. 1 19 
and 120. In Fig. 119 the given function is shown in (A). In (B) it is switched off 
between t = 0 and t = 2 (because u(t — 2) = 0 when t < 2) and is switched on 
beginning at t = 2. In (C) it is shifted to the right by 2 units, say, for instance, by 2 secs, 
so that it begins 2 secs later in the same fashion as before. More generally we have the 
following. 

Let f{t) = 0 for all negative t Then f(t — a)u(t — a) with a > 0 is /( t) shifted 
(< translated ) to the right by the amount a. 

Figure 120 shows the effect of many unit step functions, three of them in (A) and 
infinitely many in (B) when continued periodically to the right; this is the effect of a 
rectifier that clips off the negative half-waves of a sinuosidal voltage. CAUTION! Make 
sure that you fully understand these figures, in particular the difference between parts (B) 
and (C) of Figure 1 1 9. Figure 1 1 9(C) will be applied next. 


u(t) 

1 

0 t 

Fig. 117. Unit step function u(t) 


u(t — a) 

1 - 


0 a t 

Fig. 118. Unit step function u(t - a) 
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(A) fit) = 5 sin t (B) fit)uit - 2) <C) fit - 2 )uit - 2) 

Fig. 119. Effects of the unit step function: (A) Given function. 
(B) Switching off and on. (C) Shift. 



(A) k[uit — 1) - 2 uit - 4) + uit - 6)1 (B) 4 sin (^nt)[u{t) - uit - 2) + uit - 4) - + •••] 

Fig. 120. Use of many unit step functions. 


Time Shifting (t-Shifting): Replacing t by t — a in f(t) 

The first shifting theorem (“^-shifting”) in Sec. 6.1 concerned transforms F(s ) = ££{f(t)j 
and F(s — a) = ££{e at f(t) }. The second shifting theorem will concern functions f(t) and 
/(/ — a). Unit step functions are just tools, and the theorem will be needed to apply them 
in connection with any other functions. 

THEOREM 1 Second Shifting Theorem; Time Shifting 

If f(t) has the transform F(s ), then the “shifted function" 

f 0 if t < a 

(3) f(t) = f(t - a)u(t - a) = \ 

l /(/ — a) if t > a 

has the transform e~ as F(s). That is, if $£{f(t)} = F(s), then 

(4) X{f(t - a)u(t - a)} = e~^F(s). 

Or, if we take the inverse on both sides, we can write 
(4*) f(t - a) u{t - a) = X~ l {e-^F{s)}. 


Practically speaking, if we know F(s), we can obtain the transform of (3) by multiplying 
F(s) by <? -as . In Fig. 119, the transform of 5 sin t is F(s) = 5/(s 2 + 1), hence the shifted 
function 5 sin ( t - 2) u(t - 2) shown in Fig. 119(C) has the transform 


e-^Fis) = 5e-* s Hs 2 + 1 ). 
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PROOF 


EXAMPLE 1 


We prove Theorem 1. In (4) on the right we use the definition of the Laplace transform, 
writing rfor t (to have t available later). Then, taking e~ as inside the integral, we have 

00 00 

e~°*F(s) = f e~ ST f(i) dr - [ e~ s(r + a) f(r) dr. 

J o J o 

Substituting r 4- a = /, thus r = t — a, dr = dt, in the integral (CAUTION, the lower limit 
changes!), we obtain 

f°° 

e-^Fis) = J e~ $t f(t - a ) dt. 


To make the right side into a Laplace transform, we must have an integral from 0 to <», 
not from a to But this is easy. We multiply the integrand by u(t — a ). Then for t from 
0 to a the integrand is 0, and we can write, with J as in (3), 

e-^Fis) = *-*/(/ - a)u(t - a) dt = e~ st f{f) dt. 

J o J o 

(Do you now see why u(t — a) appears?) This integral is the left side of (4), the Laplace 
transform of f(t) in (3). This completes the proof. ■ 

Application of Theorem 1. Use of Unit Step Functions 

Write the following function using unit step functions and find its transform. 


/</) = 


’2 if 0 < / < 1 

\i 2 if 1 < i < \ir 

cos t if t > kir. 


(Fig. 121) 


Solution . Step 1. In terms of unit step functions, 

f(t) = 2(1 - u(f - 1)) + - 1) - u(t - |ir)) + (cos/)//(f - %tt). 

Indeed, 2(1 — w(/ - I)) gives f(t) for 0 < t < 1, and so on. 

Step 2. To apply Theorem 1, we must write each term in /(f) in the form /(r - a)u(t - a). Thus, 2(1 - u{t - 1)) 
remains as it is and gives the transform 2(1 — e“ s )/.v. Then 

- 1)] = a ( - l) 2 + (/ - 1) + y)«(/ - ») = (7 + 7 + i) 

ce{i, 2 „(, _ I*)] = £g{I (, - j*) + f (' - {«) + T" )"(' “ \")} 

£e{( C °s 0«(/ - j*)} - ifij— (sin - y «■))«(/ - i.)j = - 77J-* - "*. 


Together, 

2(f) 


■ 7 - 7 *- ■ * (7 * 7 ♦ i)'~ - (7 + i? * £)•— - 777 *"”' 
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EXAMPLE 2 


EXAMPLE 3 


If the conversion of /(/) to f(i — a) is inconvenient, replace it by 

(4**) - a)) = + a)}. 

(4**) follows from (4) by writing /(/ — a) = g(t), hence f(t) = g(t + a) and then again writing / for g. Thus, 
- 1)} = + l) 2 } = + f + y} = «-(£ + £ + £) 

as before. Similarly for ££{|/ 2 w(/ - 577 )}. Finally, by (4**), 


^jcosf u[t — — 7rjJ = e '"sf 2 £gjcos ^/ + — 7rjj = e rrSl2 X[— sin r} = —e 


-7T$/2 . 


S*+\ 



Application of Both Shifting Theorems. Inverse Transform 

Find the inverse transform /(/) of 

F(s) = 771 ? + 771 ? + V+W • 

Solution . Without the exponential functions in the numerator the three terms of F(s) would have the inverses 
(sin 7 r/)/ 7 r, (sin 7 T/)/ 7 r, and te" 2 * because Us 2 has the inverse /, so that 1 /(s + 2) 2 has the inverse te~ 2t by the 
first shifting theorem in Sec. 6.1. Hence by the second shifting theorem (/-shifting), 

/(/) = — sin (t T{t - 1 )) uit - 1 ) + — sin (t r(t - 2)) «(/ - 2) + (/ - 3)e“ 2a “ 3> «(/ - 3). 

7T 77 

Now sin (77/ — 77) = —sin 77/ and sin (77/ — 277) = sin 77/, so that the second and third terms cancel each other 
when / > 2. Hence we obtain /(/) = 0 if 0 < / < 1 , -(sin 7 r/)/ 7 rif 1 < / < 2, 0 if 2 < / < 3, and (/ — 3)e“ 2<c “ 3> 
if/ >3. See Fig. 122. ■ 



Response of an RC-Circuit to a Single Rectangular Wave 

Find the current /(/) in the tfC-circuit in Fig. 123 if a single rectangular wave with voltage V 0 is applied. The 
circuit is assumed to be quiescent before the wave is applied. 
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EXAMPLE 4 




Fig. 123. RC- circuit, electromotive force v(f), and current in Example 3 


Solution. The input is Vq[u(i — a) - u(t — />)]. Hence the circuit is modeled by the integro-differential 
equation (see Sec. 2.9 and Fig. 123) 

m + 77 = WD + 7 J o Hr) dr = i>(/) = V 0 [u(r -a)- u(t - b )]. 

Using Theorem 3 in Sec. 6.2 and formula (I) in this section, we obtain the subsidiary equation 

ms) + ^ = -7- [«-“ - <rn 
Solving this equation algebraically for /($), we get 


I(s) = FWe-™ - e-**) 


where 


FU) = 


V a /R 


and 


X~\F) = ^ r" <RO 
K 


s + i/(£Q 

the last expression being obtained from Table 6.1 in Sec. 6.1. Hence Theorem I yields the solution (Fig. 123) 

Hr) = 3t\d = - <r 1 {<r iM, F(s)} = [,-«-«/<ro m( , - a) - *-«-»/« ®„ ( , _ w ] ; 

A 


that is, /(/) = 0 if t < n, and 


M = 


*!<?■ 


-tiCRC) 


if n < / < 6 


l(*i - K 2 )e~ tKRC) if a> b 
where K t = V 0 e anRO IR and K 2 = V 0 e b,(RC) /R. ■ 

Response of an RiC-Circuit to a Sinusoidal Input Acting Over a Time Interval 

Find the response (the current) of the /?LC-circuit in Fig. 124, where E(t) is sinusoidal, acting for a short time 
interval only, say. 


£(/) = 100 sin 400f if 0 < / < 2tt 


and 


£(/) = 0 if / > 2 tt 


and current and charge are initially zero. 

Solution. The electromotive force £(/) can be represented by (100 sin400/)(I - u{t - 2n r)). Hence the 
model for the current i(i) in the circuit is the integro-differential equation (see Sec. 2.9) 

/( 0 ) = 0 , /'( 0 ) = 0 . 


0 . 1 /' + 1 1 / + 100 [ Hr) dr = (100 sin 400/)(l - «(/ - 2 rr)), 
J o 


From Theorems 2 and 3 in Sec. 6.2 we obtain the subsidiary equation for I(s) = %(i) 

-2rrS \ 


„ l 100-4005 /I e~ 2/n *\ 

0.15/ +117+100- = -a 5 - ( . 

s 5 2 + 400 2 \s s ) 
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Solving it algebraically and noting that s 2 + 110$ + 1000 = ($ + 10)($ + 100), we obtain 

1000-400 / ^ «~ 2w * \ 

/(s) ~ (s + 10)(i + 100) \i 2 + 400 2 ~ s 2 + 400 2 / ‘ 

For the first term in the parentheses (* • •) times the factor in front of them we use the partial fraction expansion 

400 000s A B Ds + K 

(s + 10)(s + 100)(i 2 + 400 2 ) ~ j + 10 + s + 100 + J 2 + 400 2 ' 

Now determine A, B , D, K by your favorite method or by a CAS or as follows. Multiplication by the common 
denominator gives 

400 000 J = A(.t + 100)(s 2 + 400 2 ) + B(s + 10)(s 2 + 400 2 ) + (Ds + K)(s + 10)(s + 100). 

We set s = — 10 and - 100 and then equate the sums of the s 3 and s 2 terms to zero, obtaining (all 
values rounded) 


t S - -10) 

(. s = - 100 ) 

($ 3 - terms) 
($ 2 -terms) 


-4 000 000 = 90C10 2 + 400 2 )A. 
-40 000 000 = -90(1 00 2 + 400 2 )#, 
0 ' = A + B + D, 

0= 100A + 10 B + HOD 




A = -0.27760 


B = 2.6144 
D = -2.3368 


K = 258.66. 


Since K = 258.66 = 0.6467 • 400, we thus obtain for the first term I x in / = / x — / 2 


0,2776 2.6144 2.3368$ 0.6467 - 400 

s + 10 + s *f 100 s 2 + 400 2 * s 2 + 400 2 


From Table 6.1 in Sec. 6.1 we see that its inverse is 

hit) = -0.2776<T 1O£ + 2.6144<T 100£ - 2.3368 cos400r + 0.6467 sin400r. 

This is the current /(/) when 0 < t < 2 tt> It agrees for 0 < t < 2 tt with that in Example 1 of Sec. 2.9 (except 
for notation), which concerned the same flLC-circuit. Its graph in Fig. 62 in Sec. 2.9 shows that the exponential 
terms decrease very rapidly. Note that the present amount of work was substantially less. 

The second term /j of / differs from the first term by the factor e~ 2rrS . Since cos 400(/ - 2tt) = cos 400r 
and sin 400(r - 27 t) = sin 400/, the second shifting theorem (Theorem 1) gives the inverse / 2 (0 = 0 if 
0 < / < 2 7T, and for > 27r it gives 

/ 2 (0 = -0.2776e -lo<t_2,r) + 2.6144 <r 100<£ - 2,rt - 2.3368 cos 400/ + 0.6467 sin 400/. 

Hence in /(f) the cosine and sine terms cancel, and the current for / > 2 tt is 
/(/) = — 0.2776 (<?“ 10 £ - + 2.6144 (<?“ loot - 

It goes to zero very rapidly, practically within 0.5 sec. ■ 


C= 10 -2 F 



E(t) 

Fig. 124. RLC- circuit in Example 4 
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1. WRITING PROJECT. Shifting Theorem. Explain 
and compare the different roles of the two shifting 
theorems, using your own fonnulations and examples. 


2-13 


UNIT STEP FUNCTION AND SECOND 
SHIFTING THEOREM 


Sketch or graph the given function (which is assumed to 
be zero outside the given interval). Represent it using unit 
step functions. Find its transform. Show the details of your 
work. 

2. t (0 < t < 1) 3. e l (0 < t < 2) 

4. sin 3/ (0 < t < n) 5. t 2 (1 < t < 2) 

6. t 2 (t > 3) 7. cos t rt (1 < t < 4) 

8 . 1 - e~ l (0 < t < 77 -) 9. / (5 < t < 10) 

10. sin a>t ( t > 6ttI<o) 11. 20 cos irt (3 < t < 6) 

12. sinh / (0 < / < 2) 13. e 77 ' (2 < r < 4) 


14-22 


INVERSE TRANSFORMS BY THE 
SECOND SHIFTING THEOREM 


Find and sketch or graph /(r) if X(f) equals: 

14. se~ s /(s 2 4* (a 2 ) 

15. e~* s /s 2 

16. j- 2 - (s’ 2 4 s-^e- 8 

17. (e~ 2rrs - e~** s )!(s 2 4 1) 

18. e-"*/(s z 4 2j 4 2) 19. e~ 2s ls 5 

20. (1 - e~ s+k )/(s - k) 21. se- 3s f(s 2 - 4) 

22. 2.5(e~ 3 8s - e~ 2 6s )/s 


23-34 


INITIAL VALUE PROBLEMS, SOME WITH 
DISCONTINUOUS INPUTS 


Using the Laplace transform and showing the details, solve: 

23. y" + 2/ + 2 y = 0. y(0) = 0, 

>-'(0) = I 

24. 9 y" - 6y' + y = 0, y(0) = 3, 

/(0) = 1 

25. y" + 4y' + 13y = 145 cos 2 r, y(0) = 10, 
y'(0) = 14 

26. y" + 10 y' + 24y = 144? 2 , y(0) = % 

y'(0) = -5 

27. y" + 9 y = r(t). r(t ) = 8 sin t if 0 < t < it and 0 

if t > it; y(0) = 0, y f (0) = 4 

28. y" + 3 y' + 2y = r(/), r(/) = 1 if 0 < / < 1 and 

0 if / > 1 ; y(0) = 0, /(0) = 0 

29. y" + y = /•(/), r(/) = / if 0 < t < 1 and 0 if 

1 > 1; y(0) = y'(0) = 0 


30. y" - 16y = r{t), r(t) = 48e 2 ' if 0 < t < 4 and 

0 if / > 4; y(0) = 3, y'(0) = -4 

31. y" + y' — 2 y = r(t), r(t) = 3 sin t - cos t if 

0 < 1 < Its and 3 sin 2 1 — cos 2t if t > 2ir‘, 
y(0) = 1 , /( 0 ) = 0 

32. y" + 8/ + 15y = r(r), r(r) = 35e 2t if 

0 < / < 2 and 0 if t > 2; y(0) = 3, 

y'(0) = -8 

33. (Shifted data) y" + 4y = 8/ 2 if 0 < t < 5 and 0 
if / > 5; y(l) = 1 + cos 2, y f (l) = 4 — 2 sin 2 

34. y" + 2y' +5 y = 10 sin / if 0 < i < 2 it and 0 if 
f > 2 it; y(ir) = 1, y'(7r) = 2 e—* - 2 

MODELS OF ELECTRIC CIRCUITS 

35. (Discharge) Using the Laplace transform, find the 
charge q(t) on the capacitor of capacitance C in Fig. 125 
if the capacitor is charged so that its potential is V 0 and 
the switch is closed at / = 0. 



Fig. 125. Problem 35 

1 36-38 1 RC-CIRCUIT 

Using the Laplace transform and showing the details, find 
the current i(t) in the circuit in Fig. 126 with R= 10 fl and 
C = 10~ 2 F, where the current at t = 0 is assumed to be 
zero, and: 

36. v(t) = 100 V if 0.5 </< 0.6 and 0 otherwise. 
Why does i(t) have jumps? 

37. v = 0 if / < 2 and 100(; - 2) V if / > 2 

38. v = 0 if / < 4 and 14 • 10 6 <r 3< V if / > 4 



Fig. 126. Problems 36-38 


39-41 


RL-CIRCUIT 


Using the Laplace transform and showing the details, find 
the current i(t) in the circuit in Fig. 127, assuming /( 0) = 0 
and: 
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39. R = 10 n, L = 0.5 H, v = 200/ V if 0 < / < 2 and 
0 if / > 2 

40. R = 1 kfl (= 1000 ft), L = I H, v = 0 if 
0 < t < w, and 40 sin t V if t > v 

41. R = 25 a, L = 0.1 H, v = 490<T 5 ‘ V if 
0 < / < 1 and 0 if t > 1 



v(i) 

Fig. 127. Problems 39-41 


|42-44| LC-CIRCUIT 

Using the Laplace transform and showing the details, find 
the current i(t) in the circuit in Fig. 128, assuming zero 
initial current and charge on the capacitor and: 

42. L = 1 H, C = 0.25 F, v = 200(/ - %t 3 ) V if 
0 < t < 1 and 0 if / > 1 

43. L = 1 H, C = 10" 2 F, v = -9900 cos / V if 
7r < t < 3 tt and 0 otherwise 

44. L = 0.5 H, C = 0.05 F, v = 78 sin r V if 
0 < / < 7r and 0 if / >77 



v(t) 


Fig. 128. Problems 42-44 
45-47 1 RLC-CIRCUIT 

Using the Laplace transform and showing the details, find 
the current i(t) in the circuit in Fig. 129, assuming zero 
initial current and charge and: 

45. R = 2 L = 1 H, C = 0.5 F, v(t) = 1 kV if 
0 < / < 2 and 0 if t > 2 

46. R = 4 L = 1 H, C = 0.05 F, v = 34e“ fc V 
if 0 < / < 4 and 0 if t > 4 

47. R = 2 a, L = 1 H, C « 0.1 F, v = 255 sin / V 
if 0 < / < 277 and 0 if / > 277 


C 



vU) 


Fig. 129. Problems 45-47 


6.4 Short Impulses. Dirac's Delta Function. 

Partial Fractions 

Phenomena of an impulsive nature, such as the action of forces or voltages over short 
intervals of time, arise in various applications, for instance, if a mechanical system is hit 
by a hammerblow, an airplane makes a “hard” landing, a ship is hit by a single high wave, 
or we hit a tennisball by a racket, and so on. Our goal is to show how such problems are 
modeled by “Dirac’s delta function” and can be solved very efficiently by the Laplace 
transform. 

To model situations of that type, we consider the function 

1 Ik if a ^ t ^ a + k 

(1) hit - a) = (Fig. 130) 

. 0 otherwise 

(and later its limit as k — » 0). This function represents, for instance, a force of magnitude 
l/k acting from r = a to t = a + k, where k is positive and small. In mechanics, the 
integral of a force acting over a time interval a ^ t ^ a + k is called the impulse of the 
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force; similarly for electromotive forces E(f) acting on circuits. Since the blue rectangle 
in Fig. 130 has area 1, the impulse of f k in (1) is 


( 2 ) 


,a+k 


4 = J fk(t ~a)dt = \ j-dt= 1. 


To find out what will happen if k becomes smaller and smaller, we take the limit of f k 
as k — » 0 (k > 0). This limit is denoted by 8(t — a), that is, 

8(t - a) = lim f k (t - a). 

kr-> 0 

8(t — a) is called the Dirac delta function 2 or the unit impulse function. 

8{t — a) is not a function in the ordinary sense as used in calculus, but a so-called 
generalized function . 2 To see this, we note that the impulse I k of f k is 1, so that from (1) 
and (2) by taking the limit as k — > 0 we obtain 


(3) 


8(t — a) 


cc 

.0 


if t = a 
otherwise 


and 


f 8(t - a) dt = 1, 
J o 


but from calculus we know that a function which is everywhere 0 except at a single point 
must have the integral equal to 0. Nevertheless, in impulse problems it is convenient to 
operate on 8(t — a) as though it were an ordinary function. In particular, for a continuous 
function g(t) one uses the property [often called the sifting property of 8(1 — a), not to 
be confused with shifting] 


(4) 


[ g(t)80 - a) dt = g(a) 
J o 


which is plausible by (2). 

To obtain the Laplace transform of 8(t — a), we write 

fk(t ~ a) = “ [w(f - a) - u{t - ( a + k))] 


r 


„ Area = 1 

l/k 

\ 




a a + k t 


Fig. 130. The function f k [t — a) in (1) 


2 PAUL DIRAC (1902-1984), English physicist, was awarded the Nobel Prize [jointly with the Austrian 
ERWIN SCHRODINGER (1887-1961)] in 1933 for his work in quantum mechanics. 

Generalized functions are also called distributions. Their theory was created in 1936 by the Russian 
mathematician SERGEI L’VOVICH SOBOLEV (1908-1989). and in 1945. under wider aspects, by the French 
mathematician LAURENT SCHWARTZ (1915-2002). 
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EXAMPLE 1 


and take the transform [see (2)] 


%{f k (t - «)} - t- t« _as ~ e~ ia+k)s ] = — . 

KS /C S 

We now take the limit as k — > 0. By 1’HdpitaTs rule the quotient on the right has the limit 
1 (differentiate the numerator and the denominator separately with respect to k, obtaining 
se~ 1cs and s , respectively, and use se^/s — » 1 as k 0). Hence the right side has the 
limit e~ as . This suggests defining the transform of 8(f — a) by this limit, that is, 

(5) 5£{S(r-a)} =*-<*. 


The unit step and unit impulse functions can now be used on the right side of ODEs 
modeling mechanical or electrical systems, as we illustrate next. 

Mass-Spring System Under a Square Wave 

Determine the response of the damped mass-spring system (see Sec. 2.8) under a square wave, modeled by (see 
Fig. 131) 

y" + 3/ + 2y = r(t) = u(t - l) - «(/ - 2), v(0) = 0, y'(0) = 0. 


Solution. From (1) and (2) in Sec. 6.2 and (2) and (4) in this section we obtain the subsidiary equation 

s z Y + 3rK + 2Y = - (e~ s - « -2s ). Solution F(r) = — ; — \ — (e~* - e -2 *). 

s s(s 4 3$ 4 2) 

Using the notation F(s) and partial fractions, we obtain 

J 1 U2 l_ 1/2 

^ ^ s{s 2 4- 3 s 4 2) s(s 4 I)(j 4 2) s s 4 1 5 4 2 

From Table 6.1 in Sec. 6.1, we see that the inverse is 


m = 2-Hn = l-e~ t + \e~ zt . 


Therefore, by Theorem 1 in Sec. 6.3 (/-shirting) we obtain the square-wave response shown in Rg. 131, 


y = S£~\F(s)e~ s - F(.v)e -2s ) 

= f(t - 1 )«(r -!)-/«- 2)u(t - 2) 

'0 

1 -*-»-» + J*-**-® 

-«-«-!> + e -«" 2 > + I*" 2 ""” - } e - 2(t ~ 2) 


(0 < t < 1 ) 

(1 < t < 2 ) 

(/ > 2 ). ■ 


yti) 

1 


0.5 f— 



t 


Fig. 131. Square wave and response in Example 1 



244 


CHAP. 6 Laplace Transforms 


EXAMPLE 2 


EXAMPLE 3 


Hammerblow Response of a Mass-Spring System 

Find the response of the system in Example 1 with the square wave replaced by a unit impulse at time 
t = 1. 

Solution . We now have the ODE and the subsidiary equation 


y" + 3y' + 2y = 80 - 1). 


and ( s 2 + 3s + 2)Y = 


Solving algebraically gives 


m = 


(s + 1 )(s + 2) 


= (— u 

\ .9 + 1 s+ 2/ 


By Theorem 1 the inverse is 


yO) = XT\Y) = 


0 




if 0 < r < 1 
if t > I. 


yO) is shown in Fig. 132. Can you imagine how Fig. 131 approaches Fig. 132 as the wave becomes shorter and 
shorter, the area of the rectangle remaining 1 ? ■ 



Fig. 132. Response to a hammerblow in Example 2 


Four-T erminal RIC-Network 

Find the output voltage response in Fig. 133 if R = 20 H, L — 1 H, C = 10“ 4 F, the input is 80) (a unit impulse 
at time t - 0), and current and charge are zero at time / = 0. 

Solution . To understand what is going on. note that the network is an RLC-circuit to which two wires at A 
and B are attached for recording the voltage v0) on the capacitor. Recalling from Sec. 2.9 that current /(/) and 
charge q0) are related by / = q = dq/dt, we obtain the model 

Li' + Ri + = Lq" + Rq' + = q" + 20 q + 10000 9 = S(/). 

From (1) and (2) in Sec. 6.2 and (5) in this section we obtain the subsidiary equation for 0 (j) = %(q) 

9 1 

+ 20 s + 10 000)0 = 1. Solution Q = 5 . 

(s + I0) 2 + 9900 

By the first shifting theorem in Sec. 6.1 we obtain from Q damped oscillations for q and w, rounding 
9900 = 99.50 2 . we get (Fig. 133) 

q = X~\Q) = e~ 10t sin 99.50/ and v = ^ = 100.5e _lot sin 99.50/. ■ 
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EXAMPLE 4 




Fig. 133. Network and output voltage in Example 3 


More on Partial Fractions 

We have seen that the solution Y of a subsidiary equation usually appears as a quotient 
of polynomials Y(s) = F(s)/G(s ), so that a partial fraction representation leads to a 
sum of expressions whose inverses we can obtain from a table, aided by the first 
shifting theorem (Sec. 6.1). These representations are sometimes called Heaviside 
expansions. 

An unrepeated factor s — a in G{s) requires a single partial fraction AJ(s — a). See 
Examples 1 and 2 on pp. 243, 244. Repeated real factors (s — a ) 2 , (s — a ) 3 , etc., require 
partial fractions 

+ Al A 3 , Az A ! 

(s — a) 2 s — a (s — a) 3 (s — a) 2 s — a 

The inverses are (A 2 t + A x )e at , (\A 2 t 2 + A 2 t + A t )e at , etc. 

Unrepeated complex factors (s — a)(s — a), a = a + ifi, d = a — ifr require a partial 
fraction (As + B)/[(s — a) 2 + /3 2 ]. For an application, see Example 4 in Sec. 6.3. 
A further one is the following. 

Unrepeated Complex Factors. Damped Forced Vibrations 

Solve the initial value problem for a damped mass-spring system acted upon by a sinusoidal force for some 
time interval (Fig. 134), 

y" + 2y + 2y = r(f), r(/) = 10 sin 2r if 0 < t < ir and 0 if / > ir; y(0) = 1, y (0) = —5. 

Solution . From Table 6.1, (1), (2) in Sec. 6.2, and the second shifting theorem in Sec. 6.3, we obtain the 
subsidiary equation 

o 

(i 2 r - * + 5) + 2 (sY - I) + 2Y = 10 (1 - e - ”®). 

j + 4 

We collect the T-terras, {s 2 + 2s + 2)Y, take — s -f 5 - 2 = — s + 3 to the right, and solve, 

20 20e~” s s- 3 

(6) Y ~ (s 2 + 4)0 2 + 2s + 2) (s 2 + 4)(f 2 + 2s + 2) + s 2 + 2s + 2 ' 

For the last fraction we get from Table 6.1 and the first shifting theorem 


( 7 ) 


. f j + 1 - 4 I 

x U+ o 2 + 1 l -e (c ° st_4sin,) - 
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In the first fraction in (6) we have unrepeated complex roots, hence a partial fraction representation 

20 At + B Ms + N 

($ 2 + 4)($ 2 + 2s + 2) ~~ a * 2 + 4 a - 2 + 2s + 2 ’ 

Multiplication by the common denominator gives 

20 = (At + B)(s 2 + 2s + 2) + (Ms + N)(s 2 + 4). 

We determine A, 5, M, N. Equating the coefficients of each power of s on both sides gives the four equations 

(a) Is 3 ]: 0 = A + M (b) U 2 ]: 0 = 24 + B + N 

(c) [s]: 0 = 2A + 2B + AM (d) [a 0 ]: 20 = 2B + AN. 

We can solve this, for instance, obtaining M - -A from (a), then A = B from (c), then N - -3A from (b), 
and finally A = -2 from (d). Hence A = -2, B - -2, M = 2, N = 6, and the first fraction in (6) has the 
representation 

-2s - 2 2(s + 1) + 6 - 2 

(8) — « + 5 . Inverse transform: -2 cos 2f - sin 2r + e (2 cos / + 4 sin t). 

s 2 + 4 (f + l) 2 + I 

The sum of this and (7) is the solution of the problem for 0 < t < tt. namely (the sines cancel). 

(9) y(t) = 3<? -t cos / - 2 cos 2/ - sin 2/ if 0 < / < tt. 

In the second fraction in (6) taken with the minus sign we have the factor e~ 7TS , so that from (8) and the second 
shifting theorem (Sec. 6.3) we get the inverse transform 

+2 cos (2/ - 27 t) + sin (2 1 - 2t t) - e _a_ir) |2 cos (r - tt) 4* 4 sin ( t - tt)\ 

= 2 cos 2 1 + sin 2 1 + e _a_ir) (2 cos / + 4 sin /). 

The sum of this and (9) is the solution for i > tt, 

(10) y(/) = e~ l 1(3 + 2e 7r ) cos / + 4^ sin /] if r > tt. 

Figure 134 shows (9) (for 0 < / < if) and (10) (for t > 7r), a beginning vibration, which goes to zero rapidly 
because of the damping and the absence of a driving force after t - tt. ■ 


Driving force 



Fig. 134. 



Example 4 


Output (solution) 


The case of repeated complex factors [(s — a)(s — a)] 2 , which is important in connection 
with resonance, will be handled by “convolution” in the next section. 
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PROBLEM SET 6.4 


1 1—12 1 EFFECT OF DELTA FUNCTION ON 
VIBRATING SYSTEMS 

Showing the details, find, graph, and discuss the solution. 

1. v" + v = so - 2ir), v(0) = 10, 

v'(0) = 0 

2. y" + 2/ + 2y = e~‘ + 5 80 - 2), 

y(0) = 0. y'(0) = I 

3. y" - y = 10S(/ - £) - I005(/ - 1), 

y(0) = 10. v'(0) = I 

4. y" + 3y' + 2y = 10(sin t + 80 ~ D), 

y(0) = 1. y'(0) = — I 

5. y" + 4y' + 5y = [l - u0 - I0)]e' - e 10 SO ~ 10), 

y(0) = 0, y'(0) = I 

6. y" + 2y' - 3y = 100S(/ - 2) + 100S(/ - 3), 

y(0) = I, y'(0) = 0 

7. y" + 2y' + lOy = I0[l - uO ~ 4)j - I05(/ - 5), 

y(0) =1, y'(0) = I 

8. y" + 5 y' + 6 y = 80 ~ & ir) + uO — it) cos /. 

y(0) = 0, ' y'(0) = 0 

9. v" + 2y' + 5y = 25/ - 1006(/ - i r), 

y(0) = -2. y'(0) = 5 

10. y" + 5y = 25/ - I00S(/ - n). y(0) = -2, 

y'(0) = 5. (Compare with Prob. 9.) 

11. v" + 3y' - 4v = 2<? ! - 8 e 2 S0 - 2). 

v(0) = 2, y'(0) = 0 

12. y" + y = — 2 sin / + 105 (/ — it), y(0) = 0, 

y'(0) = l 


for the differential equation, involving k , take specific 
k's from the beginning. 

(b) Experiment on the response of the ODE in 
Example 1 (or of another ODE of your choice) to an 
impulse 6(/ — a) for various systematically chosen a 
(> 0); choose initial conditions y(0) £ 0, y'(0) = 0. 
Also consider the solution if no impulse is applied. Is 
there a dependence of the response on «? On b if you 
choose b8(t - a)l Would -8(t - a) with a > a 
annihilate the effect of 8(t - a)l Can you think of 
other questions that one could consider 
experimentally by inspecting graphs? 

15. PROJECT. Heaviside Formulas, (a) Show that for a 
simple root a and fraction Af(s — a) in F(s)/G(s) we 
have the Heaviside formula 


A — lim 

s — a 


(,v - a)FQ) 
GO) 


(b) Similarly, show that for a root a of order m and 
fractions in 


f(/) ^ ^m— 1 j 

GO) O ~ a) m O- a)"‘~ l 

H F further fractions 

s — a 


we have the Heaviside formulas for the first coefficient 


A m = lim 

s-*a 


(£ - arm 

G{S) 


13. CAS PROJECT. Effect of Damping. Consider a 
vibrating system of your choice modeled by 

y" + cy' + ky = r(/) 

with r(t) involving a 6-function, (a) Using graphs of 
the solution, describe the effect of continuously 
decreasing the damping to 0, keeping k constant. 

(b) What happens if c is kept constant and k is 
continuously increased, starting from 0? 

(c) Extend your results to a system with two 
6-functions on the right, acting at different times. 

14. CAS PROJECT. Limit of a Rectangular Wave. 
Effects of Impulse. 

(a) In Example 1, take a rectangular wave of area 1 
from 1 to 1 + k. Graph the responses for a sequence 
of values of k approaching zero, illustrating that for 
smaller and smaller k those curves approach the curve 
shown in Fig. 132. Hint: If your CAS gives no solution 


and for the other coefficients 

„ 1 ,. d m ~ k f (,v ~ a) m FO) 1 

Ak (m - k)\ di m ~ k l GO) J ’ 

k - I, • • • ,m - I. 

16. TEAM PROJECT. Laplace Transform of Periodic 
Functions 

(a) Theorem. The Laplace transform of a piecewise 
continuous function f(t) with period p is 

1 f p 

(ID W) = , _ - p , I e~ s, f0) dt O > 0). 

l e Jq 

Prove this theorem. Hint: Write f^= J p + + . . . 

Set / = (/#— I )p in the /tth integral. Take out 
from under the integral sign. Use the sum formula for 
the geometric series. 
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(b) Half-wave rectifier. Using (11), show that the 
half-wave rectification of sin wt in Fig. 135 has the 
Laplace transform 


, ( 0(1 + e~ vsla ) 

m) ~ (* 2 + o, 2 )(l - e - 2 ”°'n 

<o 

~ (s 2 + w 2 )(l - e-”*'") ' 

(A half-wave rectifier clips the negative portions of the 
curve. A full-wave rectifier converts them to positive; 
see Fig. 136.) 


fit) 



0 k/(o 2k!(o 3 kI(o t 


Fig. 135. Half-wave rectification 



(c) Full-wave rectifier. Show that the Laplace 
transform of the full-wave rectification of sin wt is 


co 

s 2 + w 2 


7 TS 

coth - — . 
2co 


(d) Saw-tooth wave. Find the Laplace transform of 
the saw-tooth wave in Fig. 137. 



(e) Staircase function. Find the Laplace transform of 
the staircase function in Fig. 138 by noting that it is 
the difference of ktfp and the function in (d). 


fit) 
k - 


0 


r 

I 

_L 

P 


I 


I 

J 


2 P 


3 p 


t 


Fig. 136. Full-wave rectification 


Fig. 138. Staircase function 


6.5 Convolution. Integral Equations 

Convolution has to do with the multiplication of transforms. The situation is as follows. 
Addition of transforms provides no problem; we know that ££(/ + g) = ££(/) + ££(g). 
Now multiplication of transforms occurs frequently in connection with ODEs, integral 
equations, and elsewhere. Then we usually know X(J) and £(g) and would like to know 
the function whose transform is the product X(f)X(g), We might perhaps guess that it is 
fg, but this is false. The transform of a product is generally different ft m om the product of 
the transforms of the factors , 


£(fg) * $£(f)£(g) in general. 

To see this take f = e l and g = 1. Then fg = e\ !£(fg) = l/(y — 1), but ££(/) = l/(s - 1) 
and %(1) = 1/j give £(f)£(g) = l/(s 2 - s ). 

According to the next theorem, the correct answer is that $£(f)$£(g) is the transform of 
the convolution of / and g, denoted by the standard notation f * g and defined by the 
integral 


Kt) = (/ * g)(t) = f f(r)g(t - r) dr. 


a) 
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THEOREM 1 


EXAMPLE 1 


EXAMPLE 2 


PROOF 


Convolution Theorem 

If two functions f and g satisfy the assumption in the existence theorem in Sec . 6.1, 
so that their transforms F and G exist, the product H = FG is the transform of h 
given by (1). (Proof after Example 2.) 


Convolution 

Lei H(s) = l/[(5 - a)s]. Find h{t). 

Solution . 1 f{s - a) has the inverse f(t) = e at , and l/$ has the inverse g(t) = 1. With /(r) = e ar and 
g(t — r) = 1 we thus obtain from (1) the answer 

h(t) = e at * l = I e ar - 1 dr = - (e ai - 1). 

J Q a 


To check, calculate 


H(s) = XWS) = - 
a 



t) 




- = 2(e°‘) 2(1). ■ 

J? 


Convolution 

Let //(j) = l/(s 2 + <o 2 ) 2 . Find /?(/). 

Solution. The inverse of l/(s 2 + o?) is (sin a>t)l(o. Hence from (1) and the trigonometric formula (1 1) in 
App. 3.1 with x = + (ot) and y = — wr) we obtain 


sin o)l sin <ot 
h{t) = * 


sin o)7 sin <o(t - t) dr 


= “T [ sin 
cu * / o 

I f* 

= — s' I [~cos w/ + cos cjt] dr 

2o) •» q 

1 P sinwr"!* 

= — 5 - -7 coso)/ -4 

2«> 2 L " Jr- 

If sin (ot 1 

= — o -f cos on + 

2or L v J 


in agreement with formula 21 in the table in Sec. 


6.9. 


We prove the Convolution Theorem 1. CAUTION! Note which ones are the variables 
of integration! We can denote them as we want, for instance, by t and p, and write 

OO 00 

F(s) = f e~ s J(r) dr and G{s) = f e~*g{p) dp. 

J o J o 

We now set t = p + r, where r is at first constant. Then p = t — t, and t varies from r 
to °°. Thus 
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t in F and t in G vary independently. Hence we can insert the G-integral into the 
F-integral. Cancellation of e~ ST and e ST then gives 

00 00 

e~ ST f(T)e ST f e~ st g(t - r) dt cIt = f f( t) 
j t J o 


F(s)G(s) = / 

J o 


f 


e st g(t — t) dt dr . 


Here we integrate for fixed r over t from r to co and then over r from 0 to oo. This is die 
blue region in Fig. 139. Under the assumption on / and g the order of integration can be 
reversed (see Ref. [A5] for a proof using uniform convergence). We then integrate first 
over r from 0 to t and then over t from 0 to that is, 

F(s)G(s) = f e~* [ f(r)g(t - r)drdt= f e~*h(t) dt = £(h) = H(s ). 

J o J o J o 

This completes the proof. ■ 



Fig. 139. Region of integration in the 
tr-plane in the proof of Theorem 1 


From the definition it follows almost immediately that convolution has the properties 

f * g = 8 * f (commutative law) 

/ 955 tei + £ 2 ) = / * gi + / * £2 (distributive law) 

(/ * g) * v = / * (g * v) (associative law) 

/* 0 = 0*/=0 

similar to those of the multiplication of numbers. Unusual are the following two properties. 

Unusual Properties of Convolution 

/ * 1 # / in general. For instance, 

t* 1 = I r* 1 dr = h 2 t. 

J o 


(/ * /)(*) ^ 0 may not hold. For instance, Example 2 with a> = 1 gives 

sin / * sin / = — | / cos / + | sin t 


(Fig. 140). ■ 
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EXAMPLE 4 



Fig. 140. Example 3 


We shall now take up the case of a complex double root (left aside in the last section in 
connection with partial fractions) and find the solution (the inverse transform) directly by 
convolution. 

Repeated Complex Factors. Resonance 

In an undamped mass-spring system, resonance occurs if the frequency of the driving force equals the natural 
frequency of the system. Then the model is (see Sec. 2,8) 

y" + (o 0 2 y = K sin co 0 i 

where a> 0 2 = klm> k is the spring constant, and m is the mass of the body attached to the spring. We assume 
y(0) = 0 and y'( 0) = 0, for simplicity. Then the subsidiary equation is 

o o Kv 0 Ka> 0 

s Y + w o Y — ~2— 2 * Ils solution is Y — 75— 53 • 

S + 6 )q ( S + <Oq ) 

This is a transform as in Example 2 with <a = co 0 and multiplied by Ko) 0 . Hence from Example 2 we can see 
directly that the solution of our problem is 

Kojq ( sin a) 0 t \ K 

v(0 = o I cos Mo? I = 2 (~to 0 tcos a> 0 t + sin o> 0 t). 

2 (o 0 \ to 0 / 2(o 0 

We see that the first term grows without bound. Clearly, in the case of resonance such a term must occur. (See 
also a similar kind of solution in Fig. 54 in Sec. 2.8.) H 


Application to Nonhomogeneous Linear ODEs 

Nonhomogeneous linear ODEs can now be solved by a general method based on 
convolution by which the solution is obtained in the form of an integral. To see this, recall 
from Sec. 6.2 that the subsidiary equation of the ODE 

(2) y" -1- ay + by = r{l) ( a , b constant) 

has the solution [(7) in Sec. 6.2] 

Y(s) = [(. j + a)y(0) + /(0)]<2(s) + R(s)Q(s) 

with R(s) = $£(r) and Q(s) = 1 f(s 2 4- as 4- b) the transfer function. Inversion of the first 
term [• • •] provides no difficulty; depending on whether \a 2 — b is positive, zero, or 
negative, its inverse will be a linear combination of two exponential functions, or of the 
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form ( Ci + c 2 t)e~ atl2 , or a damped oscillation, respectively. The interesting term is 
R(s)Q(s) because r{t) can have various forms of practical importance, as we shall see. If 
y(0) = 0 and y'(0) = 0, then Y = RQ, and the convolution theorem gives the solution 

(3) y(t) = \ q(t - r)r(r)dr. 

J o 


Response of a Damped Vibrating System to a Single Square Wave 

Using convolution, determine the response of the damped mass-spring system modeled by 

y" 4- 3y f + 2 y = /*(/), r(t) = 1 if 1 < / < 2 and 0 otherwise, y(0) = y'( 0) = 0. 


This system with an input (a driving force) that acts for some time only (Fig. 141) has been solved by partial 
fraction reduction in Sec. 6.4 (Example 1 ). 

Solution by Convolution. The transfer function and its inverse are 
1 111 


Q(s) = 


+ 35 + 2 (5 + l)(5 + 2) 5+1 5 + 2 


hence 


q{t) = e 1 - e 


Hence the convolution integral (3) is (except for the limits of integration) 

y(t) = Jg(t — r) • I dr = dr = e " w “ T) - 

Now comes an important point in handling convolution. r(r) = 1 if 1 < r < 2 only. Hence if /< 1, the integral 
is zero. Tf 1 < / < 2, we have to integrate from t = 1 (not 0) to t. This gives (with the first two terms from the 
upper limit) 

y«) = e~° - £e ~ 0 - (e"'*"” - l*-* 4 " 11 ) = \ - e~ a ~" + \e~ 2a ~ l \ 

If t > 2, we have to integrate from r = I to 2 (not to /). This gives 
y{t) = - £e- 2(£ - 2> - (•"»-« - 

Figure 141 shows the input (the square wave) and the interesting output, which is zero from 0 to 1, then increases, 
reaches a maximum (near 2.6) after the input has become zero (why?), and finally decreases to zero in a monotone 
fashion. M 


y(t) 

1 - 



Fig. 141. Square wave and response in Example 5 

Integral Equations 

Convolution also helps in solving certain integral equations, that is, equations in which 
the unknown function y(t) appears in an integral (and perhaps also outside of it). This 
concerns equations with an integral of the form of a convolution. Hence these are special 
and it suffices to explain the idea in terms of two examples and add a few problems in 
the problem set. 
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EXAMPLE 6 A Volterra Integral Equation of the Second Kind 

Solve the Volterra integral equation of the second kind 3 


yit) - y(r) sin ( t - r) dr = t. 

•'o 

Solution . From (1 ) we see that the given equation can be written as a convolution, y — y* sin i = /. Writing 
Y = $£(y) and applying the convolution theorem, we obtain 


The solution is 

Y(s) 


1 1 
w - w -2—7 = w 7 - -2 • 

s + 1 s + 1 s 

s 2 + I I I t z 

4 — = — + — and gives the answer y(t) = t + — 


s s s o 

Check the result by a CAS or by substitution and repeated integration by parts (which will need patience). 


EXAMPLE 7 Another Volterra Integral Equation of the Second Kind 

Solve the Volterra integral equation 

v(r) —1(1 + r) y(t — t) dr = 1 — sinh /. 

J o 

Solution . By (1) we can write y — (I + t) *y = 1 - sinh t. Writing Y = f£(y), we obtain by using the 
convolution theorem and then taking common denominators 


m [‘- (j ♦•?)]- 7 - +7 • «■ 

{s 2 - s — l)/s cancels on both sides, so that solving for Y simply gives 


A' 2 S 1 S 2 — 1 — S 

s 2 s(s 2 — 1) 


Y(s) = 

s ~ 1 


and the solution is y(/) = cosh /. 


PROBLEM S E T S= 5 E 


1 1-8 1 CONVOLUTIONS BY INTEGRATION 

Find by integration: 

1. 1 * 1 2 . / * r 

3. f * e' 4. e ot * e bt (a # Z>) 

5* 1 * cos a# 6. 1 * f(t) 

7. e fce * e~ kt 8. sin t * cos t 

9-16 1 INVERSE TRANSFORMS 
BY CONVOLUTION 

Find /(f) if %(f) equals: 


(s - 3XJ + 5) 
1 

s(s 2 4- 4) 


s 2 (s 2 + 1) 
1 

.v(^ 2 - 9) 


(s 2 + 16) 2 


(5 2 + 1 )(s 2 + 25) 


10. 

1 

18. y 


s(s - 1) 

19. y 

12. 

1 

20. y 


s\s - 2) 

y 


17. (Partial fractions) Solve Probs. 9, 1 1, and 13 by using 
partial fractions. Comment on the amount of work. 

18-25] SOLVING INITIAL VALUE PROBLEMS 

Using the convolution theorem, solve: 

18. y" + y = sin f, y(0) = 0, y'( 0) = 0 

19. y" + 4y = sin 3f, y(0) = 0, y'(0) = 0 

20. y" + 5 y' + Ay = 2*“ 2t , y( 0) = 0, 


3 If the upper limit of integration is variable, the equation is named after the Italian mathematician VTTO 
VOLTERRA (1860-1940), and if that limit is constant, the equation is named after the Swedish mathematician 
IVAR FREDHOLM (1866-1927). “Of the second kind (first kind)” indicates that y occurs (does not occur) 
outside of the integral. 
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21. v" + 9y = 8 sin / if 0 < / < it and 0 if t > tt; 

y(0) = 0. v'(0) = 4 

22. y ” + 3y' + 2y = 1 if 0 < I < a and 0 if I > a; 

y(0) = 0, y'(0) = 0 

23. y" + 4y = 5 u(t - 1); y(0) = 0, y'(0) = 0 

24. y" + 5y' + 6v = S(t - 3); v(0) = 1, 

y'(0) = 0 

25. v" + 6v' + By = 2S(t - I) + 2 8(r - 2): 
y(0) = I, y'( 0) = 0 


26. TEAM PROJECT. Properties of Convolution. 

Prove: 

(a) Commutativity, / * g = g * f 

(b) Associativity. (/ * g) * v = / * (g * v) 

(c) Distributivity, / * (g 1 4 g 2 ) = / * Si + / * g 2 

(d) Dirac’s delta. Derive the sifting formula (4) in 
Sec. 6.4 by using f k with a = 0 [(1), Sec. 6.41 and 
applying the mean value theorem for integrals. 

(e) Unspecified driving force. Show that forced 
vibrations governed by 

y" + a > 2 v = r(/). y(0) = K t . y'(0) = K 2 

with (o =£ 0 and an unspecified driving force ;•(/) can 
be written in convolution form. 




1 

— sin (of * /*(/) -h K x cos o)t + 
<o 


K 2 . 

— sin u)t. 
(0 


27-34 1 INTEGRAL EQUATIONS 

Using Laplace transforms and showing the details, solve: 

27. y(t) - f y(r) dr = I 

J o 

28. y(t) 4- I* y(r) cosh (/ — r) dr = t 4 e l 

J o 

29. y(t) — I y ( r) sin (/ - t) dr = cos t 

J o ' 

30. y(r) -1- 2 I y(r) cos ( t - r) dr = cos t 

J o 

31. v(/ ) 4 f (t - t)v(t) dr = I 

J o 

32. y(;) - [ y(r)(/ - r) dr = 2 - £/ 2 

J o 

33. y(/) 4 2e r f e~ r y( r) dr = /<?* 

J 0 

34. y(/) 4 f e 2a ~ r) v(t) ^/t = / 2 ~ “ I 4 

J o 

35. CAS EXPERIMENT. Variation of a Parameter. 

(a) Replace 2 in Prob. 33 by a parameter k and 
investigate graphically how the solution curve changes 
if you vary L in particular near k = — 2. 

(b) Make similar experiments with an integral 
equation of your choice whose solution is oscillating. 


6.6 Differentiation and Integration of Transforms. 
ODEs with Variable Coefficients 

The variety of methods for obtaining transforms and inverse transforms and their 
application in solving ODEs is surprisingly large. We have seen that they include direct 
integration, the use of linearity (Sec. 6.1), shifting (Secs. 6.1, 6.3), convolution (Sec. 6.5), 
and differentiation and integration of functions /(/) (Sec. 6.2). But this is not all. In this 
section we shall consider operations of somewhat lesser importance, namely, 
differentiation and integration of transforms F(s) and corresponding operations for 
functions /(/), with applications to ODEs with variable coefficients. 

Differentiation of Transforms 

It can be shown that if a function f(t) satisfies the conditions of the existence theorem in 
Sec. 6.1, then the derivative F'(s) = dFIcls of the transform F(s ) = f£(/) can be obtained 
by differentiating F(s) under the integral sign with respect to s (proof in Ref. [GR4] listed 
in App. 1). Thus, if 

F(s) = f e~ st f(t) dt, then F\s) = - f e~ st tf(t)dt. 

J o J o 
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EXAMPLE 1 


Consequently, if $£(f) = F(s), then 

(1) <£{tf(r)} = -F\s), hence X" l {F f (s)} = —//CO 

where the second formula is obtained by applying on both sides of the first formula. 
In this way, differentiation of the transform of a function corresponds to the multiplication 
of the function by —t. 

Differentiation of Transforms. Formulas 21-23 in Sec. 6.9 

We shall derive the following three formulas. 



Solution. From (l) and formula 8 (with <o = /3) in Table 6.1 of Sec. 6.1 we obtain by differentiation 
(CAUTION! Chain rule!) 


£(t sin (3 1) 


2 flv 

(s 2 + /S 2 ) 2 


Dividing by 2/3 and using the linearity of % we obtain (3). 

Formulas (2) and (4) are obtained as follows. From (1) and formula 7 (with cj = j 8) in Table 6.1 we find 


(5) 


££(/ cos fit) = — 


(, s 2 + Z3 2 ) - 2s 2 


(s 2 + f?? 

From this and formula 8 (with w = j 3) in Table 6. 1 we have 


(S 2 + /S 2 ) 2 ' 


/ i \ - r i 

^rcos fi,*- sin fi,) = (y2 + ^2 ± 77^ • 


On die right we now take the common denominator. Then we see that for the plus sign the numerator becomes 
s 2 — jS 2 + s 2 + (P — 2 .v 2 , so that (4) follows by division by 2. Similarly, for the minus sign the numerator 
takes the form s 2 - fP - s 2 - Z3 2 = -2fP, and we obtain (2). This agrees with Example 2 in Sec. 6.5. ■ 


Integration of Transforms 

Similarly, if f(t) satisfies the conditions of the existence theorem in Sec. 6.1 and the limit 
of fit) it, as t approaches 0 from the right, exists, then for s > k, 

(6) | F(s)ds hence X~ x - 



In this way, integration of the transform of a function fit) corresponds to the division of 
by t. 
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We indicate how (6) is obtained. From the definition it follows that 


f F(s) ds = 

J s 



ds. 


and it can be shown (see Ref. [GR4] in App. 1) that under the above assumptions we may 
reverse the order of integration, that is, 


J OO OOP OO ”1 OO r 00 ” 

F(s) ds = I I e~ St f(t) ds dt = \ f(t) \ e~ H ds 

J o L J s J J o L J s 


dt. 


Integration of e~ st with respect to s gives e H /(-t). Here the integral over s on the right 
equals e~ st /t. Therefore, 


fV-^ = 

J .* J o t 




m 


(s > k). 


Differentiation and Integration of Transforms 

t ti? \ i 2 + w 2 

Find the inverse transform of In 1 1 H g" 1=1° — p — • 

Solution . Denote the given transform by F(s). Its derivative is 

F'(.) = ± (ln (, 2 + «, 2 ) - l", 2 ) = - J . 

Taking the inverse transform and using (1), we obtain 

} = X~ l \ ~2~~2 - -} = 2 COS <ot - 2 = -//(f). 

[ S + (O s ) 

Hence the inverse fit) of F(s) is fit) = 2(1 - cos <ot)lt. This agrees with formula 42 in Sec. 6.9. 
Alternatively, if we let 


2s 2 

G( s ) ~ 2 , 2 ~ 

$ + C*> s 


then 


g(» = X~Hg) = 2(cos at - 1). 


From this and (6) we get, in agreement with the answer just obtained, 

j 2 + a; 2 r 

T--J. 


G(s) ds = - = j (1 - cos o>l). 


the minus occurring since s is the lower limit of integration. 
In a similar way we obtain formula 43 in Sec. 6.9, 


X 1 jin - ^ 2 -j| = J (1 - cosb at). 


Special Linear ODEs with Variable Coefficients 

Formula (1) can be used to solve certain ODEs with variable coefficients. The idea is this. 
Let X(y) = Y. Then Z(y') = sY - y(0) (see Sec. 6.2). Hence by (1), 

, d dY 

W) = --r[sY- y(0)] = -Y-s — . 


( 7 ) 
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Similarly, £(y") = s 2 Y — sy( 0) - /(0) and by (1) 

d dY 

(8) %(ty") = - — [s 2 Y - sy(0) - /(0)] = -2 sY - s 2 — + y(0). 


Hence if an ODE has coefficients such as at + b, the subsidiary equation is a first-order ODE 
for y, which is sometimes simpler than the given second-order ODE But if the latter has 
coefficients at 2 + bt + c, then two applications of (1) would give a second-order ODE for 
y, and this shows that the present method works well only for rather special ODEs with variable 
coefficients. An important ODE for which the method is advantageous is the following. 


EXAMPLE 3 Laguerre’s Equation. Laguerre Polynomials 

Laguerre’s ODE is 

(9) ty” + (1 — t)y + ny = 0. 

Wc determine a solution of (9) with n — 0. I, 2, • • ■ . From (7)-(9) we get the subsidiary equation 


-2 sY - s 2 — + ,v(0) + sY- y(0) - I -T - s— 1 + nY = 0. 


Simplification gives 


2 dY 

(i - s 2 ) — + (« + 1 - s)Y = 0. 
as 


Separating variables, using partial fractions, integrating (with the constant of integration taken zero), and taking 
exponentials, we get 


oo*) - 


dY_ 

Y 


n + 1 - s 


—fa -'-?■) 


ds 


and 


Y = 


is - \) n 


We write l n = ££ 1 (J / ) and prove Rodrigues’s formula 


( 10 ) 


/n — 




n = 1, 2, • 


These are polynomials because the exponential terms cancel if we perform the indicated differentiations. They 
are called Laguerre polynomials and are usually denoted by L n (see Problem Set 5.7, but we continue to reserve 
capital letters for transforms). We prove (10). By Table 6.1 and the first shifting theorem (^-shifting), 

WV _t ) = /','n-n ’ hence b y (3) in Sec. 6.2 & [ (»“«”*) 

K$ + 1) l «/ 

because the derivatives up to the order n - 1 are zero at 0. Now make another shift and divide by n! to get [see 
(10) and then (10*)) 

(s “ l) n . 

= -^n+i" = Y- ■ 


n!i 


(s + l) n+l 





1 1-12 1 TRANSFORMS BY DIFFERENTIATION 

Showing the details of your work, find ££(/) if /(/) equals: 
1. 4 te l 2 . — / cosh 2 1 

3. t sin <ot 4. t cos ( r + k) 


5. te~ 2t sin / 

7. t 2 sinh 4 1 
9. t 2 sin o)i 
11. t sin (r + k) 


6. t 2 sin 3 1 
8. t n e kt 
10. t cos <oi 
12. te~ kt sin / 
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13-20 1 INVERSE TRANSFORMS 

Using differentiation, integration, 5-shifting, or convolution 
(and showing the details), find /(/) if ££(/) equals: 

6 s 

13 ‘ (s 4 l) 2 14 * (.V 2 + 1 6) 2 


15. 


17. 


2 (s 4 2) 

[(5 4 2) 2 4 l] 2 
2 

(s - kf 


19. In 

s - 1 


16. 


s 


18. In 


t** - \f 

s + a 


5 4/? 


20. arccot — 
0) 


21. WRITING PROJECT. Differentiation and 
Integration of Functions and Transforms. Make a 
short draft of these four operations from memory. Then 
compare your notes with the text and write a report of 
2-3 pages on these operations and their significance in 
applications. 

22. CAS PROJECT. Laguerre Polynomials, (a) Write a 
CAS program for finding l n (r) in explicit form from 
(10). Apply it to calculate / 0 , • • • , / 10 . Verify that / 0 , 
* • • , / 10 satisfy Laguerre’s differential equation (9). 


(b) Show that 


Lit) = 2 

m-0 


(-1T 

m! 





and calculate / 0 , • • * , l 1Q from this formula. 

(c) Calculate / 0 , • • • , / 10 recursively from / 0 = I, 
li = 1 - t by 


(;z 4 1 )/ w+1 = (2n 4 1 - t)l n - nl n ^. 


(d) Experiment with the graphs of / 0 , • • • , / 10 , finding 
out empirically how the first maximum, first minimum, 
* * • is moving with respect to its location as a function 
of n. Write a short report on this. 

(e) A generating function (definition in Problem Set 
5.3) for die Laguerre polynomials is 

30 

2 /»(/)*” = (1 - 
o 

Obtain / 0 , • * * , / 10 from the corresponding partial sum 
of this power series in a* and compare the I n with those 
in (a), (b), or (c). 


6 j Systems of ODEs 

The Laplace transform method may also be used for solving systems of ODEs, as we shall 
explain in terms of typical applications. We consider a first-order linear system with 
constant coefficients (as discussed in Sec. 4.1) 

yi = auyi + a 12 y 2 + gi(t) 

o> 

yz = aziyi + a 22 y 2 + g 2 (t). 

Writing F, = ^£(y x ), Y 2 = !£(y 2 ), G x = G 2 = £C(g 2 ), we obtain from (1) in 

Sec. 6.2 the subsidiary system 


— Vi(0) — a xl Yi + ci 12 Y z 4- Gj(s) 
sY 2 - y 2 ( 0) = a 2l Y x + a 22 Y 2 + G 2 (s). 


By collecting the Y and y 2 ‘t erms we have 

(«u - s)Y l + a X2 Y 2 = -y x (0) - G x (s) 

( 2 ) 

a 21^1 "b ( a 22 ~ 5)K 2 = — G 2 (s). 

By solving this system algebraically for Ki(s), Y 2 (s) and taking the inverse transform we 
obtain the solution y x = ^ _1 (Kj), y 2 = %~\Y 2 ) of the given system (1). 
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EXAMPLE 1 


Note that (1) and (2) may be written in vector form (and similarly for the systems in 
the examples); thus, setting y = [y x y 2 ] T , A = [a jk ], g = [gj g 2 ] T , Y = [Y x Y 2 ] r , 
G = [Gi G 2 ] t we have 

y' = Ay + g and (A - sI)Y = -y(0) - G. 

Mixing Problem Involving Two Tanks 

Tank 7\ in Fig. 142 contains initially 100 gal of pure water. Tank T 2 contains initially 100 gal of water in which 
150 lb of salt are dissolved. The inflow into Ti is 2 gal/min from T 2 and 6 gal/min containing 6 lb of salt from 
the outside. The inflow into T 2 is 8 gal/min from 7\. The outflow from T 2 is 2 + 6 = 8 gal/min, as shown in 
the figure. The mixtures are kept uniform by stirring. Find and plot the salt contents y^/) and y 2 (/) in Tj and 
T 2 , respectively. 

Solution, The model is obtained in the form of two equations 

Time rate of change = inflow/min - Outflow/min 

for the two tanks (see Sec. 4.1). Thus, 

, _ 8 2 , _ 8 8 

■ Vl 100 yi + 100 yz + 6> - Va 100 yi 100 >z ' 

The initial conditions arey JL (0) = 0, y 2 (0) = 150. From this we see that the subsidiary system (2) is 

6 

(—0.08 - s)Y x + 0.02 Y 2 = 

s 

0.08/! + (-0.08 - s)Y 2 = - 150. 

We solve this algebraically for Y\ and Y z by elimination (or by Cramer’s rule in Sec. 7.7), and we write the 
solutions in terms of partial fractions. 


V 

9s + 0.48 

100 

62.5 

37.5 

Y 1 - 

s(s + 0.12)(J + 0.04) ' 

s 

.v + 0.12 

s + 0.04 

r 2 = 

150i 2 + 12s + 0.48 

100 

125 

75 

s(s 4* 0. 1 2)(s + 0.04) 

s 

s + 0.12 

s 4- 0.04 


By taking the inverse transform we arrive at the solution 

= 100 - 62.5e~ o m - 37.5e" 0 04t 
v 2 = 100 + \25e~ 0 lzt - 75e~ omt . 

Figure 142 shows the interesting plot of these functions. Can you give physical explanations for their main 
features? Why do they have the limit 100? Why is y 2 not monotone, whereas v* is? Why is Yi from some time 
on suddenly larger than y 2 ? Etc. ■ 



Fig. 142. Mixing problem in Example 1 
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EXAMPLE 2 


Other systems of ODEs of practical importance can be solved by the Laplace transform 
method in a similar way, and eigenvalues and eigenvectors as we had to determine them 
in Chap. 4 will come out automatically, as we have seen in Example 1. 

Electrical Network 

Find the currents ii(t) and i 2 (/) in the network in Fig. 143 with L and R measured in terms of the usual units 
(see Sec. 2.9), u(/) = 100 volts if 0 ^ ^ 0.5 sec and 0 thereafter, and t(0) = 0, /'(0) = 0. 


L 2 =1H 




Fig. 143. Electrical network in Example 2 

Solution . The model of the network is obtained from Kirchh off’s voltage law as in Sec. 2.9. For the lower 
circuit we obtain 

0.8/| + K'i - is) + l-4ii = 100[l - u(t - i)] 

and for the upper 

l-i^+Uk-ii) = 0 - 


Division by 0.8 and ordering gives for the lower circuit 

/J + 3/l - 1.25/a = 125[l - «(/ - |)] 

and for the upper 


'2 “ '1 


i 2 ~ 0- 


With /*x(0) = 0. i 2 (0) = Owe obtain from (1) in Sec. 6.2 and the second shifting theorem the subsidiary system 


/l e~ sl2 \ 

(s + 3)/, - 1.25/a = 125 I- —I 


-h + (,v + l)/ 2 = 0. 
Solving algebraically for /j and / 2 gives 


125(i+,) (1 - e~ slz ), 

(1 - e~ s '\ 


ll S(s + |)(s + 5) 

h= 125 


s(s + |)(s + 1) 

The right sides without the factor 1 — e~ sl2 have the partial fraction expansions 

500 125 625 

Is 3 (s + i) ~ 21(s + |) 


and 
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EXAMPLE 3 


500 250 250 

Is - 3(5 + i) + 21(5 + £) ’ 

respectively. The inverse transform of this gives the solution for 0 = / = 

125 ,,, 625 _ 7 , ;2 500 

ij(t) = - — e U2 - — e 1U2 + — 

1 3 21 7 

( OS/Si). 

250 _ t/2 250 _ 7t/2 500 

<2(0 Y e + IT e + T 

According to the second shifting theorem the solution for t > J is i^/) - / x (/ — |) and i 2 (/) - / 2 (f — that is, 

/,« = - -y- (I - •"V* - fp (I - 

«>D 

< 2 «) = - ^ (i - «"•)«-*» + ™ (i - * 7/ V _7t/2 

Can you explain physically why both currents eventually go to zero, and why /j(f) has a sharp cusp whereas 
i 2 (t) has a continuous tangent direction at t = |? ■ 

Systems of ODEs of higher order can be solved by the Laplace transform method in a 
similar fashion. As an important application, typical of many similar mechanical systems, 
we consider coupled vibrating masses on springs. 



Fig. 144. Example 3 


Model of Two Masses on Springs (Fig. 144) 

The mechanical system in Fig. 144 consists of two bodies of mass 1 on three springs of the same spring constant 
k and of negligibly small masses of the springs. Also damping is assumed to be practically zero. Then the model 
of the physical system is the system of ODEs 


y" = -*7i + k(y 2 - y{) 
y'i = -* 0>2 ~ yi) ~ h* 


Here and y 2 are the displacements of the bodies from their positions of static equilibrium. These ODEs follow 
from Newton’s second law, Mass X Acceleration = Force, as in Sec. 2.4 for a single body. We again regard 
downward forces as positive and upward as negative. On the upper body, — kyi is the force of the upper spring 
and k{y 2 — y\) that of the middle spring, y 2 — yi being the net change in spring length — think this over before 
going on. On the lower body, - k{y 2 - yj) is the force of the middle spring and — ky 2 that of the lower spring. 
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We shall determine the solution corresponding to the initial conditions >* 1 (0) = 1 , y 2 (0) = 1 , y[( 0) = V3£, 
y 2 (0) “ — VS. Let = Z£{y x ) and V 2 = S£(y 2 ). Then from (2) in Sec. 6.2 and the initial conditions we obtain 
the subsidiary system 

s 2 Y x - j - VS = -kY 1 + k(Y 2 - Y x ) 
s 2 Y 2 - s + VS = -k(Y 2 - Ti) - kY 2 . 

This system of linear algebraic equations in the unknowns Y x and Y 2 may be written 

(s 2 + 2*)*! - kY 2 = s + VS 
-kY x 4 - (s 2 + 2 k)Y 2 = s- VS. 


Elimination (or Cramer's rule in Sec. 7.7) yields the solution, which we can expand in terms of partial fractions. 

_ (s + Vik)(s 2 + 2k) + k{s - V3fc) _ .t Vtt 
Yl ~ (,s 2 + 2kf ~ k z ~ s 2 + k + s 2 + 3 k 


_ (s 2 + mis - Vm + us + V3I) _ ^ Vat 

* 2 ~ (.t 2 + 2kf - k 2 ~ s 2 + k s 2 + 3k ' 

Hence the solution of our initial value problem is (Fig. 145) 

y x {t) = %~\ y x ) = cos Vkt + sin VS/ 
y 2 (/) = i£“ 1 (y 2 ) = cos Viet - sin VS/. 

We see that the motion of each mass is harmonic (the system is undamped!), being the superposition of a “slow” 
oscillation and a “rapid” oscillation. H 



Fig. 145. Solutions in Example 3 




1-20 


SYSTEMS OF ODES 


Using the Laplace transform and showing the details of 
your work, solve the initial value problem: 


i- y[ = ~yi - 3*2* y'2 = >’1 - j’ 2 . 
n(0) = 0, 3-2(0) = 1 


2. 3-( = 5^! + )-2, 3*2 = 3-1 + 53*2, 
3’i(0) = 1, 3-2(0) = “3 


3. y[ = -63-! + 43> 2 , 3-2 = —43’! + 43-2, 

3-i(0) = -2, J2(0) = "7 

4 . 3 -i + 3-2 = 0 , .Vi +3-2 = 2 cos r, 

3’i(0) = 1. 3-2W = 0 

5 . 3* x = - 4 >-! - 23» z + t, 3-2 = 33»| + 3’ 2 - t, 

3-i(0) = 5.75. 3> 2 (0) = -6.75 

6 . y\ = 43 - 2-8 cos 4 t, 3-2 = - 33 -! - 9 sin 4 /, 

3 -i( 0 ) = 0 , 3 - 2 ( 0 ) = 3 
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7. y[ = Sy x ~ 4 y 2 - 9 / 2 + 2/, 

y 2 = 10?! - 7y 2 “ 17/ 2 - 2/, 

?i(0) = 2, ? 2 (0) = 0 

8. y[ = 6?! + y 2> )4 = 9y x + 6y 2 , 

*(0) = -3, .y 2 (0) = -3 

9. y[ = 5?! 4- 5y 2 — 15 cos / + 27 sin /, 
v 2 = “10?! - 5? 2 _ 150 sin/, 

)’i(0) = 2, ? 2 (0) = 2 

10. y i = -2?! + 3)> 2 , >4 = 4y x - ? 2 , 

.Vi(0) = 4, ? 2 (0) = 3 

11. )4 = ?2 + 1 “ w(/ “ 1)> 

y 2 = ~.vi + 1 - «(' - i), *(0) = 0, 

y 2 (0) = 0 

12. ?i = 2>! 4- > 2 . ?i = 4?! 4- 2y a 4- 64/t/(/ - i), 

?i(0) = 2, y 2 (0) = 0 

13. y[ = v x 4- 6w(/ - 2)<? 4t , y 2 = )T + 2? 2 , 

?i(0) = 0, y 2 (0) = 1 

14. y[ = ~y 2 , y 2 = -?i + 2[1 - u(t - 277)] cos/, 

Vi(0) = 1, y 2 (0) = 0 

15. y[ = -3?! + y 2 4- u(t - \)e\ 

)4 = 4?! 4- 2y 2 + u{t “ l)e l , 

.Vi(0) = 0, ? 2 (0) = 3 

16. y 1 = -2?! + 2? 2f y 2 = 2y x - 5y 2# 

.Vi(0) = 1, ?i(0) = 0, ? 2 (0) = 3, ?i(0) = 0 

17. y'{ = 4 ?! •!• 8v 2 , ? 2 = 5? x 4- y 2 , 

?i(0) = 8, ?i(0) = -18, y 2 (0) = 5, 

) 4 ( 0 ) = -21 

18. yj + ? 2 = —101 sin 10/, y 2 4- y t = 101 sin 10/, 

v,(0) = 0, y'i(0) = 6“ ? 2 (0) = 8, 

ykO) = ~6 

19. y[ 4- ? 2 = 2e l 4- <?“*, y 2 4- y' 3 = 2 sinh t, 

y 3 + y[ = e % 

) ? i(0) = 0, ? 2 (0) = 1, v 3 (0) = 1 

20. 4?; 4- ?J - 2)4 = 0, -2?i + ? 3 = 1, 

2)4 “ 4)4 = -16/ 

)’i(0) = 2, y 2 (0) = 0, ? 3 (0) = 0 

21. TEAM PROJECT. Comparison of Methods for 
Linear Systems of ODEs. 

(a) Models. Solve the models in Examples I and 2 of 
Sec. 4.1 by Laplace transforms and compare the 
amount of work with that in Sec, 4.1 . (Show the details 
of your work.) 

(b) Homogeneous Systems. Solve the systems (8), 
(1 1)— (13) in Sec. 4.3 by Laplace transforms. (Show the 
details.) 

(c) Nonhomogeneous System. Solve the system (3) 
in Sec. 4.6 by Laplace transforms. (Show the details.) 


FURTHER APPLICATIONS 

22. (Forced vibrations of two masses) Solve the model in 
Example 3 with k — 4 and initial conditions ) ; 1 (0) = 1, 
>>{(0) = 1 , )> 2 (0) = 1 , ? 2 (0) = — 1 under the assumption 
that the force 11 sin / is acting on the first body and the 
force - 1 1 sin / on the second. Graph the two curves on 
common axes and explain the motion physically. 

23. CAS Experiment. Effect of Initial Conditions. In 

Prob. 22, vary the initial conditions systematically, 
describe and explain the graphs physically. The great 
variety of curves will surprise you. Are they always 
periodic? Can you find empirical laws for the 
changes in terms of continuous changes of those 
conditions? 

24. (Mixing problem) What will happen in Example 1 if 
you double all flows (in particular, an increase to 
12 gal/min containing 12 lb of salt from the outside), 
leaving the size of the tanks and the initial conditions 
as before? First guess, then calculate. Can you relate 
the new solution to the old one? 

25. (Electrical network) Using Laplace uansforms, find 
the currents i x (t) and i 2 {t) in Fig. 146, where 
v(t) = 390 cos; and ^(O) = 0, i 2 (0) = 0. How 
soon will the currents practically reach their steady 
state? 


4ft 8 Q 



r— vw— 1 

L ^ 

< 

> < 

> 8£2 

1 1 

> 


» — '"WF' — 


2 H 4 H 

Network 



Fig. 146. Electrical network and 
currents in Problem 25 


26. (Single cosine wave) Solve Prob. 25 when the EMF 
(electromotive force) is acting from 0 to 27r only. Can 
you do this just by looking at Prob. 25, practically 
without calculation? 
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6.8 Laplace Transform: General Formulas 


Formula 

Name, Comments 

Sec. 

00 

F(s) = m(t)} = f e~ st f(!) dt 
J o 

m = s-m/wi 

Definition of Transform 
Inverse Transform 

6.1 

2{af(t) + bg(t)} = aX{f(t)} + b£{g(t)} 

Linearity 

6.1 

%{e at f(t)} = F(s - a) 
%-HF(s - a)} = e at m 

^-Shifting 

(First Shifting Theorem) 

6.1 

m') = s%(f) - m 
2{f) = s 2 X(f) - sf( 0) - /'( 0) 

£(f n) = s n X(f) - i* w_1) /(0) 

f n ~ v (0) 

X^jj(r)dr = y <£(/) 

Differentiation 
of Function 

Integration of Function 

6.2 

(/ * g)U) = f f(r)g(l - t) dr 
•'o 

= f /(/ - r)g(r) dr 
J o 

m * g > = 

Convolution 

6.5 

£{/(/ - a) u(t -a)} = e~ ax F(s) 
<e- 1 {<T < “F(s)} = /(/ - a) «(/ - a) 

r-Shifting 

(Second Shifting Theorem) 

6.3 

3 

tum) = —f’(s) 

Differentiation of Transform 
Integration of Transform 

6.6 

m= j _ e ~p S \ Q e- sl f(t)d, 

/ Periodic with Period p 

6.4 

Project 

16 
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6.9 Table of Laplace Transforms 

For more extensive tables, see Ref. [A9] in Appendix 1. 



(continued) 
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Table of Laplace Transforms ( continued ) 



(continued) 
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Table of Laplace Transforms ( continued ) 



F(s) = <e{/(f)} 

m 

Sec. 

41 

s — a 
In , 

- (e bt - e at ) 



s - b 

t 



s 2 + w 2 

2 


42 

In 9 

s 

y (1 — cos cot) 

6.6 


s 2 - a 2 

2 


43 

In 9 

— (1 — cosh at) 




t 


44 

CO 

1 


arctan — 

— sin cot 



s 

t 


45 

1 

— arccot s 
s 

Si (/) 

App. 

A3.1 


^FKAP TE RE^KE^IEW^QU^ T I O N S AND PROBLEMS 


1. What do we mean by operational calculus? 

2. What are the steps needed in solving an ODE by Laplace 
transform? What is the subsidiary equation? 

3. The Laplace transform is a linear operation. What does 
this mean? Why is it important? 

4. For what problems is the Laplace transform preferable 
over the usual method? Explain. 

5. What are the unit step and Dirac’s delta functions? Give 
examples. 

6. What is the difference between the two shifting 
theorems? When do they apply? 

7. I s2{/«*(/)} = 2{/(r)}2{*(/)}? Explain. 

8. Can a discontinuous function have a Laplace transform? 
Does every continuous function have a Laplace 
transform? Give reasons. 

9. State the transforms of a few simple functions from 
memory. 

10. If two different continuous functions have transforms, 
the latter are different. Why is this practically important? 


11-22 


LAPLACE TRANSFORMS 


Find the transform (showing the details of your work and 
indicating the method or formula you are using): 


11. te 3t 


12. e 1 sin 2/ 


13. sin 2 t 
15. tu(t — 7 r) 

17. e l * cos 2/ 

19. sin t + sinh / 

21. e a£ - (a * b) 

23-J4 1 INVERSE LAPLACE TRANSFORMS 


14. cos 2 4/ 

16. u(t — 2 tt) sin / 

18. (sin cot) * (cos cot) 
20. cosh / - cos t 
22. cosh 2/ - cosh / 


Find the inverse transform (showing the details of your work 
and indicating the method or formula used): 


23. 

10* 

24. 

15 

s 2 + 2 

s* — 4 

25. 

12 

26. 

3s 

s 2 + 4s + 20 

s 2 - 2s + 2 

27. 

5s + 4 „ 

s 2 ^ 

28. 

2s - 10 e 
s 3 

29. 

2s + 4 

30. 

s 2 - 16 

(s 2 + 4s + 5) 2 

(s 2 + 16) 2 

31. 


32. 

180 + 18s 2 + 3s' 
s 7 

33. 

7T 

34. 

2 

s 2 (s 2 + a > 2 ) 

2s 2 + 2s + 1 
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35-50 1 SINGLE ODEs AND SYSTEMS OF ODEs 

Solve by Laplace transforms, showing the details and 

graphing the solution: 

35. y" + y = «(/ - 1), y(0) = 0, 

/( 0 ) = 20 

36. y" + 16y = 4 8{t - if), y(0) = -l, 

y'(0) = 0 

37. y" + 4y = 8 $(/ - 5), y(0) = 10, 

y'(0) = -1 

38. y" + y = «(/ - 2). y(0) = 0. 

,v'(0) = 0 

39. y" + 2/ + lOy = 0, y(0) = 7, 

y'( 0) = -1 

40. y" + 4 y' + 5y = 50/, y(0) = 5, 

y'(0) = -5 

41. y" — y' - 2y = 12w(r - rr) sin /, 

y(0) = 1, y'(0) = -1 

42. y" - 2y' + y = t8{t - 1), 

y(0) = 0, v'(0) = 0 

43. y" - 4y' + 4v = 8(t - 1) - 8(r - 2), 

y(0) = 0, y'(0) = 0 

44. y" + 4y = 8(1 - -it) - 8(r - 2tt), 

y(0) = 1, y'(0) = 0 

45. y[ + y 2 = sin /, y 2 + )’i = — sin /, 

yi(0) = 1 , y 2 (0) = 0 

46. y[ = — 3y x + y 2 — 12/, y 2 = — 4y t + 2y 2 + 12r, 

}’i(0) = 0, y 2 (0) = 0 

47. yi = y 2 , y 2 = -5yi - 2y 2 , 

yi(0) = 0, y 2 (0) = 1 

48. yi = y 2 , y 2 = -4y t + 5(/ - tr), 

yi(0) = 0, y 2 (0) = 0 

49. v” = 4y 2 - 4e l , y 2 = 3yx + y 2 , 

y^O) = 1, yi(0) = 2, y 2 (0) = 2, y£(0) = 3 

50. y" = 1 6y 2 , y 2 = 16y lt 

Ti(0) = 2, yj(0) = 12, y 2 (0) = 6, y 2 (0) = 4 

MODELS OF CIRCUITS AND NETWORKS 

51. ORC-circuit) Find and graph the current «(f) in the RC- 
circuit in Fig. 147, where R = 100 fl, C = 10 -3 F, 
u(f) = 100/ V if 0 < t < 2, o(0 = 200 V if / > 2 and 
the initial charge on the capacitor is 0. 



o(t) 


Fig. 147. RC-circuit 

52. (LC-circuit) Find and graph the charge q(t) and the 
current i(t) in the LC-circuit in Fig. 148, where 
L = 0.5 H, C = 0.02 F, v(t) = 1425 sin 5t V if 


0 < / < 7 r, v(l) = 0 if t > it, and current and charge at 
t = 0 are 0. 


L 

o(0 

Fig. 148. LC-circuit 

53. (RLC-circuit) Find and graph the current i(t) in the 
RLC-circuit in Fig. 149, where R = 1 fl, L = 0.25 H, 
C = 0.2 F, v(l) = 377 sin 20/ V, and current and charge 
at / = 0 are 0. 


C 



o(0 

Fig. 149. RLC-circuit 


54. (Network) Show that by KirchhofFs voltage law 
(Sec. 2.9), the currents in the network in Fig. 150 are 
obtained from the system 

Li[ + R(i l - t 2 ) = v(t) 

R(i 2 - i'i) + >2 = 0. 

Solve this system, where R = 1 fl, L = 2 H, C = 0.5 
F, o(0 = 90<T t/4 V, 0(0) = 0, / 2 (0) = 2 A. 


L 



Fig. 150. Network in Problem 54 


55. (Network) Set up the model of the network in Fig. 151 
and find and graph the currents, assuming that the 
currents and the charge on the capacitor are 0 when the 
switch is closed at / = 0. 


L= 1 H 



Switch J? 2 = 30 Q 


Fig. 151. Network in Problem 55 




Summary of Chapter 6 


269 


Laplace Transforms 


The main purpose of Laplace transforms is the solution of differential equations and 
systems of such equations, as well as corresponding initial value problems. The 
Laplace transform F(s) = ££(/) of a function f(t) is defined by 

f°° 

(1) F(s) = %(f) = e~ sl f{t) dt (Sec. 6.1). 

J o 

This definition is motivated by the property that the differentiation of / with respect 
to t corresponds to the multiplication of the transform F by s; more precisely, 

sea - ') = s£(f) - m 

(2) (Sec. 6.2) 
££(/") = s 2 Z(f) - 5/(0) - /'(0) 

etc. Hence by taking the transform of a given differential equation 

(3) y" + ay r + by = r(t) (a, b constant) 

and writing i t(y) = K(.y), we obtain the subsidiary equation 

(4) (s 2 + as + b)Y = X(r) + 5/(0) + /'( 0) + af(0). 

Here, in obtaining the transform £(r) we can get help from the small table in 
Sec. 6. 1 or the larger table in Sec. 6.9. This is the first step. In the second step we 
solve the subsidiary equation algebraically for F(jt). In the third step we determine 
the inverse transform y(t) = that is, the solution of the problem. This is 

generally the hardest step, and in it we may again use one of those two tables. Y(s) 
will often be a rational function, so that we can obtain the inverse by partial 

fraction reduction (Sec. 6.4) if we see no simpler way. 

The Laplace method avoids the determination of a general solution of the 
homogeneous ODE, and we also need not determine values of arbitrary constants 
in a general solution from initial conditions; instead, we can insert the latter directly 
into (4). Two further facts account for the practical importance of the Laplace 
transform. First, it has some basic properties and resulting techniques that simplify 
the determination of transforms and inverses. The most important of these properties 
are listed in Sec. 6.8, together with references to the corresponding sections. More 
on the use of unit step functions and Dirac’s delta can be found in Secs. 6.3 and 
6.4, and more on convolution in Sec. 6.5. Second, due to these properties, the present 
method is particularly suitable for handling right sides /*(/) given by different 
expressions over different intervals of tune, for instance, when r(t) is a square wave 
or an impulse or of a form such as /•(/) = cos t if 0 ^ t ^ 477 and 0 elsewhere. 

The application of the Laplace transform to systems of ODEs is shown in 
Sec. 6.7. (The application to PDEs follows in Sec. 12.1 1.) 
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Linear Algebra. 
Vector Calculus 


CHAPTER 7 
CHAPTER 8 
CHAPTER 9 
CHAPTER TO 


Linear Algebra: Matrices, Vectors, Determinants. Linear Systems 
Linear Algebra: Matrix Eigenvalue Problems 
Vector Differential Calculus. Grad, Div, Curl 
Vector Integral Calculus. Integral Theorems 


Linear algebra in Chaps. 7 and 8 consists of the theory and application of vectors and 
matrices, mainly related to linear systems of equations, eigenvalue problems, and linear 
transformations. 


Linear algebra is of growing importance in engineering research and teaching because it 
forms a foundation of numeric methods (see Chaps. 20-22), and its main instruments, 
matrices, can hold enormous amounts of data — think of a net of millions of telephone 
connections — in a form readily accessible by the computer. 

Linear analysis in Chaps. 9 and 10, usually called vector calculus, extends differentiation 
of functions of one variable to functions of several variables — this includes the vector 
differential operations grad, div, and curl. And it generalizes integration to integrals over 
curves, surfaces, and solids, with transformations of these integrals into one another, by 
die basic theorems of Gauss, Green, and Stokes (Chap. 10). 

Software suitable for linear algebra (Lapack, Maple, Mathematica, MaUab) can be found 
in the list at the opening of Part E of the book if needed. 

Numeric linear algebra (Chap. 20) can be studied directly after Chap . 7 or 8 because 
Chap. 20 is independent of the other chapters in Part E on numerics. 
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CHAPTER 7 

Linear Algebra: Matrices, 
Vectors, Determinants. 
Linear Systems 


This is the first of two chapters on linear algebra, which concerns mainly systems of 
linear equations and linear transformations (to be discussed in this chapter) and eigenvalue 
problems (to follow in Chap. 8). 

Systems of linear equations, briefly called linear systems, arise in electrical networks, 
mechanical frameworks, economic models, optimization problems, numerics for 
differential equations, as we shall see in Chaps. 21-23, and so on. 

As main tools, linear algebra uses matrices (rectangular arrays of numbers or functions) 
and vectors. Calculations with matrices handle matrices as single objects, denote them by 
single letters, and calculate with them in a very compact form, almost as with numbers, 
so that matrix calculations constitute a powerful “mathematical shorthand”. 

Calculations with matrices and vectors are defined and explained in Secs. 7. 1-7.2. 
Sections 7. 3-7. 8 center around linear systems, with a thorough discussion of Gauss 
elimination, the role of rank, the existence and uniqueness problem for solutions (Sec. 7.5), 
and matrix inversion. This also includes determinants (Cramer’s rule) in Sec. 7.6 (for 
quick reference) and Sec. 7.7. Applications are considered throughout this chapter. The 
last section (Sec. 7.9) on vector spaces, inner product spaces, and linear transformations 
is more abstract. Eigenvalue problems follow in Chap. 8. 

COMMENT. Numeric linear algebra (Secs. 20.1-20.5) can be studied immediately 
after this chapter. 

Prerequisite : None. 

Sections that may be omitted in a short course : 7.5, 7.9. 

References and Answers to Problems: App. 1 Part B, and App. 2. 


7.1 Matrices, Vectors: 

Addition and Scalar Multiplication 

In this section and the next one we introduce the basic concepts and rules of matrix and 
vector algebra. The main application to linear systems (systems of linear equations) begins 
in Sec. 7.3. 
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SEC 7.1 Matrices, Vectors: Addition and Scalar Multiplication 
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EXAMPLE 1 


A matrix is a rectangular array of numbers (or functions) enclosed in brackets. These 
numbers (or functions) are called the entries (or sometimes the elements ) of the matrix. 
For example. 






L 

*12 

*13 | 

0.3 1 

-5' 






! 




*21 

*22 

*23 

L 

0 -0.2 i 6 j 










_*31 

*32 

*33 J 


1 

to 






r 4 i 


1 

ft 

£ 

» 

[«i 

a 2 

03 ]. 


J 


are matrices. The first matrix has two rows (horizontal lines of entries) and three columns 
(vertical lines). The second and third matrices are square matrices, that is, each has as 
many rows as columns (3 and 2, respectively). The entries of the second matrix have two 
indices giving the location of the entry. The first index is the number of the row and the 
second is the number of the column in which the entry stands. Thus, a 2 3 (read a wo three) 
is in Row 2 and Column 3, etc. This notation is standard, regardless of whether a matrix 
is square or not. 

Matrices having just a single row or column are called vectors. Thus the fourth matrix 
in (1) has just one row and is called a row vector. The last matrix in (1) has just one 
column and is called a column vector. 

We shall see that matrices are practical in various applications for storing and processing 
data. As a first illustration let us consider two simple but typical examples. 


Linear Systems, a Major Application of Matrices 

In a system of linear equations, briefly called a linear system, such as 

4 .v : + 6lv 2 + 9*3 = 6 
6*! - lv 3 = 20 

5*i — 8*2 + *3 = U) 

the coefficients of the unknowns x x , x 2 , *3 are the entries of the coefficient matrix, call it A, 



"4 6 9 “ 


"4 6 9 6" 

A = 

6 0-2 

The matrix A = 

6 0 -2 20 


_5 -8 J_ 


.5 -8 1 10 . 


is obtained by augmenting A by the right sides of the linear system and is called the augmented matrix of the 
system. In A the coefficients of the system are displayed in the pattern of the equations. That is, their position 
in A corresponds to that in the system when written as shown. The same is true for A. 

We shall see that the augmented matrix A contains all the information about the solutions of a system, 
so that we can solve a system just by calculations on its augmented matrix. We shall discuss this in great 
detail, beginning in Sec. 7 . 3 . Meanwhile you may verify by substitution that the solution is x 1 = 3 , x 2 = §, 
x 3 =-l. 

The notation x lt ** 2 , *3 for the unknowns is practical but not essential; we could choose x, y, z or some other 
letters. ■ 
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EXAMPLE 2 


Sales Figures in Matrix Form 

Sales figures for three products I, II, III in a store on Monday (M), Tuesday (T), • • • may for each week be 
arranged in a matrix 



M 

T 

W 

Th 

F 

S 



"400 

330 

810 

0 

210 

470" 

I 

A = 

0 

120 

780 

500 

500 

960 

n 


.100 

0 

0 

270 

430 

780 _ 

ui 


If the company has ten stores, we can set up ten such matrices, one for each store. Then by adding corresponding 
entries of these matrices we can get a matrix showing the total sales of each product on each day. Can you think 
of other data for which matrices are feasible? For instance, in transportation or storage problems? Or in recording 
phone calls, or in listing distances in a network of roads? I 


General Concepts and Notations 

We shall denote matrices by capital boldface letters A, B, C, • * • , or by writing the general 
entry in brackets; thus A = [o^]. and so on. By an m x n matrix (read m by n matrix) 
we mean a matrix with m rows and n columns — rows come always first! m X n is called 
the size of the matrix. Thus an m X n matrix is of the form 


(2) A = [a jfc ] = 


a n 

a 12 

... 

ft\ n 

a 21 

a 22 

■ • * 

a 2n 

ftml 

®m2 

. . . 

ft ran , 


The matrices in (1) are of sizes 2 X 3, 3 X 3, 2 X 2, 1 X 3, and 2X1, respectively. 

Each entry in (2) has two subscripts. The first is the row number and the second is the 
column number. Thus a 2 1 is the entry in Row 2 and Column 1. 

If m = n, we call A an n X n square matrix. Then its diagonal containing the entries 
a x i, a 2 2 , • * * , a nn is called the main diagonal of A. Thus the main diagonals of the two 
square matrices in (1) are a xl , a 22 > a 33 and e~ x , 4x , respectively. 

Square matrices are particularly important, as we shall see. A matrix that is not square 
is called a rectangular matrix. 


Vectors 

A vector is a matrix with only one row or column. Its entries are called the components 
of the vector. We shall denote vectors by lowercase boldface letters a, b, • • • or by its 
general component in brackets, a = [aj], and so on. Our special vectors in (1) suggest 
that a (general) row vector is of the form 


a = [«! a z • • • , a n ]. 


For instance, 


a = [-2 5 0.8 0 1]. 
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A column vector is of the form 

bv 

. ^2 
b = 


For instance. 


b = 


4' 
0 

L-7. 


Matrix Addition and Scalar Multiplication 

What makes matrices and vectors really useful and particularly suitable for computers is 
the fact that we can calculate with them almost as easily as with numbers. Indeed, we 
now introduce rules for addition and for scalar multiplication (multiplication by numbers) 
that were suggested by practical applications. (Multiplication of matrices by matrices 
follows in the next section.) We first need the concept of equality. 


DEFINITION 


Equality of Matrices 

Two matrices A = [cij k ] and B = [b jk ] are equal, written A = B, if and only if they 
have the same size and the corresponding entries are equal, that is, 
a n = ^n» a i 2 = ^i 2 > an( 3 so on - Matrices that are not equal are called different. 
Thus, matrices of different sizes are always different. 


EXAMPLE 3 


Equality of Matrices 

Let 


^11 " 12 "! 

and B = 

F4 0" 


.^21 ^22 J 


L3 “I. 


if and only if 

= 4 . 

**12 = 

0. 


«21 = 3. 

«22 = 

-1. 


The following matrices are all different. Explain! 



"4 21 F 4 r 

.1 3 j 1.2 3 . 




DEFINITION 


Addition of Matrices 

The sum of two matrices A = [cij k ] and B = [b j7c ] of the same size is written 
A + B and has the entries aj k H- bj k obtained by adding the corresponding entries 
of A and B. Matrices of different sizes cannot be added. 


As a special case, the sum a 4* b of two row vectors or two column vectors, which must 
have the same number of components, is obtained by adding the corresponding 
components. 
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Addition of Matrices and Vectors 


r-4 6 

3" 

[5 -1 C 

>i n 5 3i 

If A = 


and B = 

. then A + B = 

L o i 

2. 

|_3 I C 

>J L3 2 2j 


A in Example 3 and our present A cannot be added. If a = [5 7 2| and b = [— 6 2 0]. then 
a + b = (— 1 9 2], 

An application of matrix addition was suggested in Example 2. Many others will follow. ■ 


DEFINITION 


Scalar Multiplication (Multiplication by a Number) 

The product of any m X n matrix A = [tf j7c ] and any scalar c (number c) is written 
cA and is the m X n matrix cA = obtained by multiplying each entry of A 
by c. 


Here ( — 1)A is simply written —A and is called the negative of A. Similarly, ( — A:)A is 
written —kA. Also, A + (— B) is written A — B and is called the difference of A and B 
(which must have the same size!). 

EXAMPLE 5 Scalar Multiplication 



“2.7 

-1.8" 


"-2.7 

1 .8“ 

II 

< 
3 1 ^ 

" 3 

-2“ 



"0 

0 “ 

If A = 

0 

0.9 

, then -A = 

0 

-0.9 

0 

1 


0A = 

0 

0 


.9.0 

— 4.5_ 


_— 9.0 

4.5 _ 


JO 

—5. 



.0 

0. 


If a matrix B shows the distances between some cities in miles, 1.609B gives these distances in kilometers. I 

Rules for Matrix Addition and Scalar Multiplication. From the familiar laws for the 
addition of numbers we obtain similar laws for the addition of matrices of the same size 
m X 77, namely, 


(a) 

A + B = B + A 


(b) 

(A + B) + C = A + (B + C) 

(written A + B + C) 

(c) 

A + 0 = A 


(d) 

A + (—A) = 0. 



Here 0 denotes the zero matrix (of size m X /?), that is, the in X n matrix with all entries 
zero. (The last matrix in Example 5 is a zero matrix.) 

Hence matrix addition is commutative and associative [by (3a) and (3b)]. 

Similarly, for scalar multiplication we obtain the rules 

(a) c(A + B) = cA 4- cB 

(b) (c + k) A = cA + kA 

(c) c(kA) = (ck) A 

(d) 1A = A. 


( 4 ) 


(written ckA) 
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PROBLEM SET 7.1 


1-8 


Let 


ADDITION AND SCALAR MULTIPLICATION 
OF MATRICES AND VECTORS 



3 

0 

4“ 


0 

-5 

-3“ 

A = 

-1 

2 

2 

, B = 

-5 

2 

4 


_ 6 

5 

— 4_ 


.-3 

4 

0. 



"0 

2 “ 


6 

r 

C = 

2 

4 

, D = 

—4 

7 


J 

3_ 


.-8 

3. 



2 “ 


“- 4 . 5 “ 


u 

= 

0 

, v = 

0.8 




_-l_ 


L L 2 _ 



15. (General rules) Prove (3) and (4) for general 3X2 
matrices and scalars c and k. 

16. TEAM PROJECT. Matrices in Modeling Networks. 
Matrices have various applications, as we shall see, 
in a form that these problems can be efficiently 
handled on the computer. For instance, they can be 
used to characterize connections in electrical 
networks, in nets of roads, in production processes, 
etc., as follows. 

(a) Nodal incidence matrix. The network in Fig. 152 
consists of 5 branches or edges (connections, numbered 
1, 2, • • *, 5) and 4 nodes (points where two or more 
branches come together), with one node being 
grounded. We number the nodes and branches and give 
each branch a direction shown by an arrow. This we 
do arbitrarily. The network can now be described by a 
“ nodal incidence matrix " A = [o^], where 


Find the following expressions or give reasons why they 

are undefined. 

1. C + D, D + C, 6(D - C), 6C - 6D 

2. 4C, 2D, 4C + 2D, 8C - OD 

3. A + C - D, C - D, D - C, B + 2C + 4D 

4. 2(A + B), 2A + 2B, 5A - A + B + C 

5. 3C - 8D, 4(3A), (4 • 3)A, B - -&A 

6. 5A - 3C, A - B + D, 4(B - 6A), 4B - 24A 

7. 33u, 4v + 9u, 4(v + 2.25u), u - v 

8. A -I- u, 12u -1- lOv, 0(B - v), OB + u 

9. (Linear system) Write down a linear system (as in 
Example 1) whose augmented matrix is the matrix B 
in this problem set. 

10. (Scalar multiplication) The matrix A in Example 2 
shows the numbers of items sold. Find the matrix 
showing the number of units sold if a unit consists of 
(a) 5 items, (b) 10 items? 

11. (Double subscript notation) Write the entries of A in 
Example 2 in the general notation shown in (2). 

12. (Sizes, diagonal) What sizes do A, B, C, D, u, v in 
this problem set have? What are the main diagonals of 
A and B, and what about C? 

13. (Equality) Give reasons why the five matrices in 
Example 3 are different. 

14. (Addition of vectors) Can you add (a) row vectors 
whose numbers of components are different, (b) a row 
and a column vector with the same number of 
components, (c) a vector and a scalar? 


Mjk 


+ 1 if branch k leaves node (7) 

< — 1 if branch k enters node (7) 

„ 0 if branch k does not touch (J). 


Show that for the network in Fig. 152 the matrix A has 
the given form 





Node © 1 

Node © 0 

Node © 0 


Node @ 


-1 


-1 

1 

0 

0 


-10 0 
Oil 
10-1 
0-10 


Fig. 152. Network and nodal incidence 
matrix in Team Project 16(a) 


(b) Find the nodal incidence matrices of the networks 
in Fig. 153. 
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Fig. 153. Networks in Team Project 16(b) 


(c) Graph the three networks corresponding to the 
nodal Incidence matrices 


“1-1 1 -r 

00-1 1 . 
.-1 1 0 0 . 


"10 0“ 
0 -1 I 

-1 1 0 

. 0 0-1. 


1 

0 

-1 

0 


1 10 0 0 “ 
-1 0 0-1 1 

0 0 110 

0 - 1-1 0 - 1 . 


+1 if branch k is in mesh | j | 
and has the same orientation 


Mjk = 


— 1 if branch k is in mesh | j | 
and has the opposite orientation 


0 if branch k is not in mesh m 


and a mesh is a loop with no branch in its interior (or 
in its exterior). Here, the meshes are numbered and 
directed (oriented) in an arbitrary fashion. Show that 
in Fig. 154 the matrix M corresponds to the given 
Figure, where Row 1 corresponds to mesh 1, etc. 



Fig. 154. Network and matrix M in 
Team Project 16(d) 


(d) Mesh incidence matrix. A network can also be 
characterized by the mesh incidence matrix M = [m jk ], 
where 


(e) Number the nodes in Fig. 154 from left to right l, 
2, 3 and the low node by 4. Find the corresponding 
nodal incidence matrix. 


7.2 Matrix Multiplication 

Matrix multiplication means multiplication of matrices by matrices. This is the last 
algebraic operation to be defined (except for transposition, which is of lesser importance). 
Now matrices are added by adding corresponding entries. In multiplication , do we multiply 
corresponding entries? The answer is no. Why not? Such an operation would not be of 
much use in applications. The standard definition of multiplication looks artificial, but 
will be fully motivated later in this section by the use of matrices in “linear 
transformations,” by which this multiplication is suggested. 
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DEFINITION 


EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


Multiplication of a Matrix by a Matrix 

The product C = AB (in this order) of an m X n matrix A = [a jk ] times an rXp 

matrix B = [f? j7c ] is defined if and only if r = n and is then the m X p matrix 

C = fo/J with entries 

” y=l, •••,;?? 

( 1 ) c jk = Qjlblk = Zjlhk + “j2 b 2 k + • • • + <*jn b nk 

1=1 k = 1, • • • ,p. 


The condition r = a means that the second factor, B, must have as many rows as the first 
factor has columns, namely n. As a diagram of sizes (denoted as shown): 

A B = C 

[m X n] [n X r] = [m X r]. 

cj k in (1) is obtained by multiplying each entry in the yth row of A by the corresponding 
entry in the £th column of B and then adding these n products. For instance, 
C 21 = ct 2 ibn 4- n 2 2 ^ 2 i + * * • + a 2 nbnh and so on. One calls this briefly a 
“multiplication of rows into columns See the illustration in Fig. 155, where n = 3. 


to = 4 



Fig. 155. Notations in a product AB = C 


Matrix Multiplication 



" 3 

5 

-r 


“2 

-2 

3 

1“ 


“ 22 

-2 

43 

42“ 

AB = 

4 

0 

2 


5 

0 

7 

8 

= 

26 

-16 

14 

6 


.“6 

-3 

2 . 


9 

-4 

1 

1 . 


.”9 

4 

-37 

-28. 


Here c n = 3 - 2 + 5 * 5 + (— I ) • 9 = 22, and so on. The entry in the box is t '23 = 4 • 3 + 0 * 7 -+ 2 • 1 = 14. 
The product BA is not defined. M 


Multiplication of a Matrix and a Vector 

pi 21 pi [4-3 + 2-51 [221 

Ll sj LsJ Li -3 8-5J L43J 


whereas 



is undefined. M 


Products of Row and Column Vectors 


‘r 


“l” 


’ 3 

6 r 

2 

_4_ 

= [19J, 

2 

A. 

[3 6 1 ] = 

6 

.12 

12 2 
24 4_ 


[3 6 I] 
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EXAMPLE 4 


EXAMPLE 5 


CAUTION! Matrix Multiplication Is Not Commutative, AB =£ BA in General 

This is illustrated by Examples 1 and 2. where one of the two products is not even defined, and by Example 3. 
where the two products have different sizes. But it also holds for square matrices. For instance. 



•-I r 

’ 1 

r 

" 99 

99“ 

. i -l. 

.100 

100. 

.-99 

-99. 


It is interesting that this also shows that AB = 0 does not necessarily imply BA = 0 or A = 0 or B = 0. We 
shall discuss this further in Sec. 7.8, along with reasons when this happens. ■ 


Our examples show that the order of factors in matrix products must always be observed 
very carefully. Otherwise matrix multiplication satisfies rules similar to those for numbers, 


namely. 

(a) 

(&A)B = k(AB) = A(*B) 

written kA B or AkB 


(b) 

A(BC) = (AB)C 

written ABC 

(2) 

(c) 

(A + B)C = AC + BC 



(d) 

C(A + B) = CA + CB 



provided A, B, and C are such that the expressions on the left are defined; here, k is any 
scalar. (2b) is called the associative law. (2c) and (2d) are called the distributive laws. 

Since matrix multiplication is a multiplication of rows into columns, we can write the 
defining formula (1) more compactly as 

(3) c jh = a^-bfc, j = !,*•*, m; k = !,•••,/?, 


where a^ is the yth row vector of A and b /c is the k\h column vector of B, so that in 
agreement with (1), 


a A = [«ji a J2 


'i k 


«Jn] 


— + Oj 2 b 2 h + 


“ 1 “ Mjnbnk.' 


L^nfcJ 


Product in Terms of Row and Column Vectors 

If A = is of size 3X3 and B = [bj k ] is of size 3X4, then 




"ajbi 

a l b 2 

aib 3 

aib 4 " 

(4) 

AB = 

a 2 bj 


^2^3 

a 2 b 4 



- a 3 b l 

^3^2 

a 3 b 3 

a 3 b 4- 


Taking = [3 5 — 1], a 2 = [4 0 2], etc., verify (4) for die product in Example I. ■ 


Parallel processing of products on the computer is facilitated by a variant of (3) for 
computing C = AB, which is used by standard algorithms (such as in Lapack). In this 
method, A is used as given, B is taken in terms of its column vectors, and the product is 
computed columnwise; thus. 


(5) 


AB = A[b! b 2 ••• b p ] = [Abj Ab 2 ••• Ab p ], 
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EXAMPLE 6 


Columns of B are then assigned to different processors (individually or several to each 
processor), which simultaneously compute the columns of the product matrix Ab 1? Ab 2 , etc. 


Computing Products Columnwise by (5) 


To obtain 



0 

4 



from (5), calculate the columns 

[.; ;][-:k:;]' l; 



4 34' 

8 -23. 




of AB and then write them as a single matrix, as 


shown in the first formula on the 


right. 


Motivation of Multiplication by Linear Transformations 

Let us now motivate the “unnatural” matrix multiplication by its use in linear 
transformations. For n = 2 variables these transformations are of the form 


(6*) 


y± — a n x i + a i 2 x 2 


\2 ~ &2l x l F a 22 x 2 


and suffice to explain the idea. (For general n they will be discussed in Sec. 7.9.) For 
instance, ( 6 *) may relate an x^-coordinate system to a yiy 2 -coordinate system in the 
plane. In vectorial form we can write ( 6 *) as 



> 1 " 

a n ci l2 


W 


a i\ x \ F # 12^2 

( 6 ) y = 


— Ax = 



= 



_^ 2 _ 

\_ a 21 a 22_ 


_ A ' 2 _ 


__ a 21 x l F Cl 22 X 2 ~ 


Now suppose further that the x^-system is related to a n^n^-system by another linear 
transformation, say, 




~ X 1 


b\\ bw 

Wl - 


~b u Wx + b X2 w 2 

(7) 

X = 

_*2_ 

= Bw = 

b> 21 b 22 _ 

_w 2 _ 


_b 2 \W\ + b 22 w 2 _ 


Then the y L y 2 -system is related to the vi^vi^-system indirectly via the x^-system, and we 
wish to express this relation directly. Substitution will show that this direct relation is a 
lineax* transformation, too, say, 


( 8 ) y = Cw = 

Indeed, substituting (7) into ( 6 ), we obtain 


C 11 

c 12 


~W 1 ~ 


'CnWi + c 12 w 2 

_ c 21 

c 22_ 


_\\’ 2 _ 


_C 2 l w l + C 22 W 2 _ 


)’i “ a ii(bu w i F bi 2 \v 2 ) F ^12(^21^1 F b 22 w 2 ) 

= (anb u F a 12 b 21 )w 1 + (a u b 12 F a 12 b 22 )w 2 
y 2 = ci 2 i(bii\Vi + bi 2 w 2 ) + ^ 22 (^ 21^1 F 622 ^ 2 ) 

— (^21^11 F #22^21)^1 F (^21^12 F ^22^22)^2* 
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DEFINITION 


EXAMPLE 7 


Comparing this with (8), we see that 

*11 = * 11^11 + * 12^21 *12 = * 11^12 + * 12^22 

*21 = * 21^11 * 22^21 c 22 = * 21^12 * 22 ^ 22 * 


This proves that C = AB with the product defined as in (1). For larger matrix sizes the 
idea and result are exactly the same. Only the number of variables changes. We then have 
m variables y and n variables x and p variables w. The matrices A, B, and C = AB then 
have sizes m X n, n X p, and m X /?, respectively. And the requirement that C be the 
product AB leads to formula ( 1 ) in its general form. This motivates matrix multiplication 
completely . 

Transposition 

Transposition provides a transition from row vectors to column vectors and conversely. 
More generally, it gives us a choice to work either with a matrix or with its transpose, 
whatever will be more practical in a specific situation. 


Transposition of Matrices and Vectors 

The transpose of an m X n matrix A = [a jk \ is the n X m matrix A T (read A transpose) 
that has the first row of A as its first column , the second row of A as its second 
column, and so on. Thus the transpose of A in (2) is A T = [a^] f written out 


(9) A t = K] = 


*11 

*21 


*12 

*22 

• • • a m2 

*1 n 

* 2 « 

&mn 


As a special case, transposition converts row vectors to column vectors and 
conversely. 


Transposition of Matrices and Vectors 


T5 -8 n 


" 5 

4~ 

o 

o 

1 

II 

< 

* 

then A t = 

-8 

0 

A little more compactly, we can write 


I oj 



Note that for a square matrix, the transpose is obtained by interchanging entries that are symmetrically positioned 
with respect to the main diagonal, e.g., a 12 and a 21 , and so on. ■ 
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EXAMPLE 8 


EXAMPLE 9 


Rules for transposition are 


( 10 ) 


(a) (A t ) t = A 

(b) (A + B) T = A T + B T 

(c) (cA) t = cA t 

(d) (AB) t = B t A t . 


CAUTION! Note that in (lOd) the transposed matrices are in reversed order . We leave 
the proofs to the student. (See Prob. 22.) 


Special Matrices 

Certain kinds of matrices will occur quite frequently in our work, and we now list the 
most important ones of them. 

Symmetric and Skew-Symmetric Matrices. Transposition gives rise to two useful 
classes of matrices, as follows. Symmetric matrices and skew-symmetric matrices are 
square matrices whose transpose equals the matrix itself or minus the matrix, respectively: 


(11) A T = A (thus a^j = aj k ), A T = —A (thus a ^ hence a# = 0). 


Symmetric Matrix 


Skew-Symmetric Matrix 


Symmetric and Skew-Symmetric Matrices 



" 20 

120 

200' 


" 0 1 

- 3 ’ 

A = 

120 

10 

150 

is symmetric, and B = 

-1 0 

-2 


.200 

150 

30 . 


<N 

1 

0 . 


is skew-symmetric. 


For instance, if a company has three building supply centers C lt C 2 , C3, then A could show costs, say, for 
handling 1000 bags of cement on center Cj, and Oj h (j ¥* k) the cost of shipping 1000 bags from Cj to C/ c . 
Clearly, aj k = a jy because shipping in the opposite direction will usually cost the same. 

Symmetric matrices have several general properties which make them important. This will be seen as we 
proceed. M 


Triangular Matrices. Upper triangular matrices are square matrices that can have 
nonzero entries only on and above the main diagonal, whereas any entry below the diagonal 
must be zero. Similarly, lower triangular matrices can have nonzero entries only on and 
below the main diagonal. Any entry on the main diagonal of a triangular matrix may be 
zero or not. 


Upper and Lower Triangular Matrices 



“I 

4 

2" 


"2 

0 

<r 

r 1. 

0 

3 

2 


8 

-1 

0 

Lo 2J 

.0 

0 

6. 


.7 

6 

8. 


“3 0 0 0 ’ 

9-3 0 0 

10 2 0 

.1 9 3 6. 


Upper ifumgulur 


Lower triangular 
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Diagonal Matrices. These are square matrices that can have nonzero entries only on 
the main diagonal. Any entry above or below the main diagonal must be zero. 

If all the diagonal entries of a diagonal matrix S are equal, say, c, we call S a scalar 
matrix because multiplication of any square matrix A of the same size by S has the same 
effect as the multiplication by a scalar, that is, 

(12) AS = SA = cA. 

In particular, a scalar matrix whose entries on the main diagonal are all 1 is called a 
unit matrix (or identity matrix) and is denoted by l n or simply by I. For I, formula (12) 
becomes 

(13) AI = IA = A. 


EXAMPLE 10 Diagonal Matrix D. Scalar Matrix S. Unit Matrix I 



‘2 

0 

0“ 


"c 

0 

(f 


"1 

0 

0“ 

D = 

0 

-3 

0 

S = 

0 

c 

0 

f I = 

0 

1 

0 


J> 

0 

0 _ 


_0 

0 

c_ 


_() 

0 

1. 


Applications of Matrix Multiplication 

Matrix multiplication will play a crucial role in connection with linear systems of 
equations, beginning in the next section. For the time being we mention some other simple 
applications that need no lengthy explanations. 


EXAMPLE 11 Computer Production. Matrix Times Matrix 

Supercomp Ltd produces two computer models PC 1 0S6 and PC 1 1 86. The matrix A shows the cost per computer 
(in thousands of dollars) and B the production figures for the year 2005 (in multiples of 10000 units.) Find a 
matrix C that shows the shareholders the cost per quarter (in millions of dollars) for raw material, labor, and 
miscellaneous. 


PC 1086 PCI 186 


Quarter 
2 3 4 


A = 


'1.2 

1.6" 

Raw Components 

r 3 

8 6 91 

0.3 

0.4 

Labor 


2 4 3 J 

_0.5 

0.6_ 

Miscellaneous 




PC 1086 
PCI 186 


Solution . 


Quarter 


1 

2 

3 

4 


"13.2 

12.8 

13.6 

15.6“ 

Raw Components 

3.3 

3.2 

3.4 

3.9 

Labor 

. 5.1 

5.2 

5.4 

6.3. 

Miscellaneous 


Since cost is given in multiples of $1000 and production in multiples of 10 000 units, the entries of C are 
multiples of $10 millions; thus c n = 13.2 means $132 million, etc. ■ 
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EXAMPLE 12 Weight Watching. Matrix Times Vector 

Suppose that in a weight- watch ins program, a person of 185 lb burns 350 cal/hr in walking (3 mph), 500 in 
bycycling (13 mph) and 950 in jogging (5.5 mph). Bill, weighing 185 lb, plans to exercise according to the 
matrix shown. Verify the calculations (W = Walking, B = Bicycling, J = Jogging). 


W B J 


MON 

“1.0 0 0.5“ 


“ 825“ 




”350" 



WED 

1.0 1.0 0.5 




1325 




500 

= 


FRI 

1.5 0 0.5 




1000 




.950. 



SAT 

2.0 1.5 1.0 


2400 


MON 

WED 

FRI 

SAT 


EXAMPLE 13 Markov Process. Powers of a Matrix. Stochastic Matrix 

Suppose that the 2004 state of land use in a city of 60 mi 2 of built-up area is 

C: Commercially Used 25% l: Industrially Used 20% R: Residentially Used 55%. 

Find the states in 2009, 2014, and 20 1 9, assuming that the transition probabilities for 5-year intervals are given 
by the matrix A and remain practically the same over the time considered. 


From C 

From I 

From R 


'0.7 

0.1 

0 " 


ToC 

0.2 

0.9 

0.2 


To 1 

.0.1 

0 

0.8 _ 


ToR 


A is a stochastic matrix, that is. a square matrix with all entries nonnegative and all column sums equal to 1. 
Our example concerns a Markov process 1 , that is. a process for which the probability of entering a certain state 
depends only on the last state occupied (and the matrix A), not on any earlier state. 

Solution . From the matrix A and the 2004 state we can compute the 2009 state. 


"0.7-25 + 0.1 -20 + 0-55" 


"0.7 

0.1 

0" 


"25" 


“19.5" 

0.2 • 25 + 0.9 • 20 + 0.2 • 55 

= 

0.2 

0.9 

0.2 


20 

= 

34.0 

.0.1-25 + 0-20 + 0.8-55. 


.0.1 

0 

0.8. 


.55. 


.46.5. 


To explain: The 2009 Figure for C equals 25% times the probability 0.7 that C goes into C, plus 20% times the 
probability 0.1 that I goes into C, plus 55% times the probability 0 that R goes into C. Together. 

25 • 0.7 + 20 * 0.1 + 55 • 0 = 19.5 [%]. Also 25 • 0.2 + 20 • 0.9 + 55 • 0.2 = 34 [%]. 

Similarly, the new R is 46.5%. We see that the 2009 state vector is the column vector 

y = 119.5 34.0 46.5J T = Ax = A [25 20 55] T 

where the column vector x = [25 20 55] T is the given 2004 state vector. Note that the sum of the entries of 
y is 100 [%J. Similarly, you may verify that for 2014 and 2019 we get the state vectors 

z = Ay = A(Ax) = A 2 x = [17.05 43.80 39.I5| T 
u = Az = A 2 y = A 3 x = [16.315 50.660 33.025f. 


1 ANDREI ANDREJEVITCH MARKOV (1856-1922), Russian mathematician, known for his work in 

probability theory. 
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Answer \ In 2009 the commercial area will be 19.5% (11.7 mi 2 ), the industrial 34% (20.4 mi 2 ) and the 
residential 46.5% (27.9 mi 2 ). For 2014 the corresponding figures are 17.05%, 43.80%, 39.15%. For 2019 they 
are 16.315%, 50.660%, 33.025%. (In Sec. 8.2 we shall see what happens in the limit, assuming that those 
probabilities remain the same. In the meantime, can you experiment or guess?) I 


-EJLO r B = bJLM = 5 €r T r Z — 2 


1 - 14 ] MULTIPLICATION, ADDITION, AND 
TRANSPOSITION OF MATRICES AND 
VECTORS 



Calculate the following products and sums or give reasons 

why they are not defined. (Show all intermediate results.) 

1. Aa, Ab, Ab T , AB 

2. Ab T + Bb T , (A + B)b T , bA, B - B T 

3. AB, BA, AA t , A t A 

4. A 2 , B 2 , (A t ) 2 , (A 2 ) t 

5. a T A, bA, 5B(3a + 2b T ), 15Ba + 10Bb T 

6 . A T b, b T B, (3A - 2B) T a, a T (3A - 2B) 

7. ab, ba, (ab)A, a(bA) 

8. ab — ba, -(4b)(7a), — 28ba, 5abB 

9. (A 4- B) 2 , A 2 4- AB 4- BA + B 2 , A 2 + 2AB 4- B 2 

10. (A 4- B)(A - B), A 2 - AB 4- BA - B 2 , A 2 - B 2 

11. A 2 B, A 3 , (AB) 2 , A 2 B 2 

12. B 3 , BC, (BC) 2 , (BC)(BC) T 

13. a T Aa, a T (A 4- A T )a, bBb T , b(B - B T )b T 

14. a T CC T a, a T C 2 a, bC T Cb T , bCC T b T 

15. (General rules) Prove (2) for 2 X 2 matrices A = 

B = [bj k ], C = [c jk \ and a general scalar. 

16. (Commutativity) Find all 2 X 2 matrices A = [a^] 
that commute with B = [b jlc J, where b jk - j 4- k. 

17. (Product) Write AB in Probs. 1-14 in terms of row 
and column vectors. 

18. (Product) Calculate AB in Prob. 1 columnwise. (See 
Example 6.) 

19. TEAM PROJECT. Symmetric and Skew- 
Symmetric Matrices. These matrices occur quite 
frequently in applications, so it is worthwhile to study 
some of their most important properties. 

(a) Verify the claims in (11) that a kj = a jk for a 
symmetric matrix, and a kj = —a jk for a skew-symmetric 
matrix. Give examples. 



(b) Show that for every square matrix C the matrix 
C 4* C T is symmetric and C — C T is skew-symmetric. 
Write C in the form C = S 4- T, where S is symmetric 
and T is skew-symmetric and find S and T in terms of 
C. Represent A and B in Probs. 1—14 in this form. 

(c) A linear combination of matrices A, B, C, • • • , 
M of the same size is an expression of the form 

(14) aA 4* £B 4- cC 4- • • • 4- mM, 

where a, • ■ ■ , m are any scalars. Show that if these 
matrices are square and symmetric, so is (14); 
similarly, if they are skew-symmetric, so is (14). 

(d) Show that AB with symmetric A and B is 
symmetric if and only if A and B commute, that is, 
AB = BA. 

(e) Under what condition is the product of skew- 
symmetric matrices skew-symmetric? 

20. (Idempotent and nilpotent matrices) By definition, 
A is idempotent if A 2 = A, and B is nilpotent if 
B m = 0 for some positive integer m. Give examples 
(different from 0 or I). Also give examples such that 
A 2 = I (the unit matrix). 

21. (Triangular matrices) Let U lf U 2 be upper triangular 
and Lj, L 2 lower triangular. Which of the following 
are triangular? Give examples. How can you save half 
of your work by transposition? 

U x + U 2 , UiU 2 , Uj 2 , Ux + Lx, U a Lx, L, + L 2 , 

LxL 2 , Lx 2 

22. (Transposition of products) Prove (lOa)-(lOc). 
Illustrate the basic formula (lOd) by examples of your 
own. Then prove it. 

APPLICATIONS 

23. (Markov process) If the transition matrix A has the 

entries a n = 0.5, a X2 — 0.3, a 2 1 = 0.5, = 0,7 and 

the initial state is [1 1] T , what will the next three 

states be? 

24. (Concert subscription) In a community of 300 000 
adults, subscribers to a concert series tend to renew their 
subscription with probability 90% and persons presently 
not subscribing will subscribe for the next season with 
probability 0.1%. If the present number of subscribers 
is 2000, can one predict an increase, decrease, or no 
change over each of the next three seasons? 
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25. CAS Experiment Markov Process. Write a program 
for a Markov process. Use it to calculate further steps in 
Example 13 of the text. Experiment with other stochastic 
3X3 matrices, also using different starting values. 

26. (Production) In a production process, let N mean “no 
trouble” and T “trouble.” Let the transition probabilities 
from one day to the next be 0.9 for N — » N. hence 0. 1 
for 7, and 0.5 for T-* jV, hence 0.5 for 7-» T. 
If today there is no trouble, what is the probability of 
N two days after today? Three days after today? 

27. (Profit vector) Two factory outlets F x and F 2 in New 
York and Los Angeles sell sofas (S), chairs (C), and 
tables (T) with a profit of $110, $45, and $80, 
respectively. Let the sales in a certain week be given by 
the matrix 

C T 
400 1001 F x 

820 205 J F 2 


5 

600 

.300 


Introduce a “profit vector” p such that the components 
of v = Ap give the total profits of F x and F 2 . 

28. TEAM PROJECT. Special Linear Transformations. 
Rotations have various applications. We show in this 
project how they can be handled by matrices. 

(a) Rotation in the plane. Show that the linear 
transformation y = Ax with matrix 


~cos 0 

—sin 0 ~ 



x i 



and 

x - 


sin 0 

cos 0_ 



- V 2. 


y = 



is a counterclockwise rotation of the Cartesian x^- 
coordinate system in the plane about the origin, where 
0 is the angle of rotation. 

(b) Rotation through «0. Show that in (a) 


A B 


cos nO — sinH0 
„sin/i0 cos/?0_ 


Is this plausible? Explain this in words. 


(c) Addition formulas for cosine and sine. By 

geometry we should have 


cos a -sin a 

COS p 

—sin p 

sin a cos a 

l— J 

_sin p 

cos p_ 


cos (a 4- P) —sin (a 4- P) 
sin (a 4- p) cos (a 4- p)_ 

Derive from this the addition formulas (6) in App. A3. 1 . 

(d) Computer graphics. To visualize a three- 
dimensional object with plane faces (e.g.. a cube), we 
may store the position vectors of the vertices with 
respect to a suitable Ajjr^Vs-coordinate system (and a 
list of the connecting edges) and then obtain a two- 
dimensional image on a video screen by projecting 
the object onto a coordinate plane, for instance, onto 
the X]* 2 -plane by sett * n £ *3 = 0 * T 0 change the 
appearance of the image, we can impose a linear 
transformation on the position vectors stored. Show 
that a diagonal matrix D with main diagonal entries 
3, 1. 1 gives from an x = [xj] the new position vector 
y = Dx, where y t = 3.v a (stretch in the ^-direction 
by a factor 3), v 2 = a 2 (unchanged), y 3 = \x z 
(contraction in the A 3 -direction). What effect would a 
scalar matrix have? 

(e) Rotations in space. Explain y = Ax geometrically 
when A is one of the three matrices 


"1 0 0 " 

0 cos 0 -sin 0 , 

_0 sin 6 cos 0 _ 


”cos (p 

0 

—sin <p~ 


"cos l ff 

—sin i/t 

O' 

0 

1 

0 


sin iff 

cos if/ 

0 

.sin (p 

0 

cos <p ._ 


0 

0 i_ 


What effect would these transformations have in 
situations such as that described in (d)? 


7.3 Linear Systems of Equations. 

Gauss Elimination 

The most important use of matrices occurs in the solution of systems of linear equations , 
briefly called linear systems. Such systems model various problems, for instance, in 
frameworks, electrical networks, traffic flow, economics, statistics, and many others. In 
this section we show an important solution method, the Gauss elimination. General 
properties of solutions will be discussed in the next sections. 
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Linear System, Coefficient Matrix, Augmented Matrix 

A linear system of m equations in n unknowns jr lf • • • , x n is a set of equations of the form 

+ • • ’ + Cl ln*n = h 

^21-^1 * * * "l" a 2jl X n b 2 


^ml^l + * • • + a mn x n b m 


The system is called linear because each variable Xj appears in the first power only, just 
as in the equation of a straight line. a n , • • • , a mn are given numbers, called the 
coefficients of the system. b x , • • • , b m on the right are also given numbers. If all the bj 
are zero, then (1) is called a homogeneous system. If at least one bj is not zero, then (1) 
is called a nonhomogeneous system. 

A solution of (1) is a set of numbers x x , • • • , x n that satisfies all the m equations. 
A solution vector of (1) is a vector x whose components form a solution of (1). If the 
system (1) is homogeneous, it has at least the trivial solution x 1 = 0, • • • , x n = 0. 

Matrix Form of the Linear System (1). From the definition of matrix multiplication 
we see that the m equations of (1) may be written as a single vector equation 

(2) Ax = b 

where the coefficient matrix A = [a jk ] is the m X n matrix 


a n a 12 
a 21 a 22 


a ln 


#2n 


and x 



rM 


and b = 


L^ml &m2 * * * ^mnj 




L * 


are column vectors. We assume that the coefficients aj k are not all zero, so that A is not 
a zero matrix. Note that x has n components, whereas b has m components. The matrix 


~tfll 

• • • a m 1 

i 

i 

V 


i 

G-mn 1 

1 


is called the augmented matrix of the system (1). The dashed vertical line could be 
omitted (as we shall do later); it is merely a reminder that the last column of A does not 
belong to A. 

The augmented matrix A determines the system (1) completely because it contains all 
the given numbers appearing in (1). 
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EXAMPLE 1 



Infinitely 
many solutions 



No solution 


Fig. 156. Three 
equations in 
three unknowns 
interpreted as 
planes in space 


Geometric Interpretation. Existence and Uniqueness of Solutions 

If m = n = 2. we have two equations in two unknowns x v x 2 

«11*1 + <*12*2 = ^1 
<*2l*l <*22*2 = ^2- 

If we interpret x 2 as coordinates in the A 1 A 2 -plane, then each of the two equations represents a straight line, 
and (a*!. a 2 ) is a solution if and only if the point P with coordinates a x , a 2 lies on both lines. Hence there are 
three possible cases: 

(a) Precisely one solution if the lines intersect. 

( b ) Infinitely many solutions if the lines coincide. 

( c ) No solution if the lines are parallel 
For instance. 


.Vi + x 2 = 1 
2*i - x 2 = 0 
Case (a) 

*2 I 



x l + x 2 = 1 
2a j + 2x 2 — 2 
Case (6) 



*i +x 2~ 1 

x \ + *2 = 0 
Case (c) 



If the system is homogenous. Case (c) cannot happen, because then those two straight lines pass through the 
origin, whose coordinates 0, 0 constitute the trivial solution. If you wish, consider three equations in three 
unknowns as representations of three planes in space and discuss the various possible cases in a similar fashion. 
See Fig. 156. B 

Our simple example illustrates that a system (1) may perhaps have no solution. This poses 
the following problem. Does a given system (1) have a solution? Under what conditions 
does it have precisely one solution? If it has more than one solution, how can we 
characterize the set of all solutions? How can we actually obtain the solutions? Perhaps 
the last question is the most immediate one from a practical viewpoint. We shall answer 
it first and discuss the other questions in Sec. 7.5. 

Gauss Elimination and Back Substitution 

This is a standard elimination method for solving linear systems that proceeds 
systematically irrespective of particular features of the coefficients. It is a method of great 
practical importance and is reasonable with respect to computing time and storage demand 
(two aspects we shall consider in Sec. 20.1 in the chapter on numeric linear algebra). We 
begin by motivating the method. If a system is in “triangular form,” say, 

2x 1 + 5a* 2 = 2 

13a* 2 = -26 


we can solve it by ‘T)ack substitution,” that is, solve the last equation for the variable, 
x 2 = “26/13 = -2, and then work backward, substituting a * 2 = -2 into the first equation 
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EXAMPLE 2 


and solve it for * lt obtaining x x = |(2 - 5* 2 ) = |(2 - 5 • (—2)) = 6. This gives us the idea 
of first reducing a general system to triangular form. For instance, let the given system be 

2x t 4 5* 2 = 2 [2 

Its augmented matrix is 

-4x 1 4- 3jc 2 = “30. L“4 

We leave the first equation as it is. We eliminate x x from the second equation, to get a triangular 
system. For this we add twice the first equation to the second, and we do the same operation 
on the rows of the augmented matrix. This gives -4x x 4 4*! 4 3* 2 + 10* 2 = “30 + 2*2, 
that is. 



2*i 4 5* 2 = 2 [2 5 2 ~ 

13* 2 = -26 Row 2 4 2 Row 1 _0 13 — 26_ 

where Row 2 4 2 Row 1 means “Add twice Row 1 to Row 2” in the original matrix. 
This is the Gauss elimination (for 2 equations in 2 unknowns) giving the triangular form, 
from which back substitution now yields * 2 = —2 and x 1 = 6, as before. 

Since a linear system is completely determined by its augmented matrix, Gauss 
elimination can be done by merely considering the matrices, as we have just indicated. 
We do this again in the next example, emphasizing the matrices by writing them first and 
the equations behind them, just as a help in order not to lose track. 

Gauss Elimination. Electrical Network 

Solve the linear system 

*1 “ x 2 + - v 3 - 0 

“A'l + A*2 - -V 3 = 0 

IO.V 2 “t* 25 a* 3 = 90 

20*! + 10.v 2 = 80. 

Derivation from the circuit in Fig . 157 (Optional). This is the system for the unknown currents 
*1 = 1 * 1 , a 2 = i 2 » - v 3 = *3 in the electrical network in Fig. 157. To obtain it, we label the currents as shown, 
choosing directions arbitrarily: if a current will come out negative, this will simply mean that the current flows 
against the direction of our arrow. The current entering each battery will be the same as the current leaving it. 
The equations for the currents result from Kirchhoff s laws: 

Kirchhoffs current law (KCL). At any point of a circuit, the sum of the inflowing currents equals the sum 
of the outflowing currents. 

Kirchhoffs voltage law (KVL). In any closed loop, the sum of all voltage drops equals the impressed 
electromotive force. 

Node P gives the first equation, node Q the second, the right loop the third, and the left loop the fourth, as 
indicated in the figure. 


80 V 



15 Q 


NodeF: 
Node Q : 
Right loop: 


,: 2 + f 3= 0 


"‘l + »z“ '3= 0 


Left loop: 20^ + 10 i 2 = 80 


Fig. 157. Network in Example 2 and equations relating the currents 
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Solution by Gauss Elimination . This system could be solved rather quickly by noticing its particular 
form. But this is not the point. The point is that the Gauss elimination is systematic and will work in general, 
also for large systems. We apply it to our system and then do back substitution. As indicated let us write the 
augmented matrix of the system first and then the system itself: 


Augmented Matrix A 


Pivot 1 



© 

1 -1 

I 1 

0“ 




-l 

1 

-i ! 

0 

Eliminate— 



0 

10 

25 | 

90 



- 

20 

10 

0 ! 

80. 




Equations 

Pivot 1 >( *i)~ v 2 + -v 3 


-*i 

+ *2 - *3 

Eliminate » 


10*2 + 25*3 


20 *! 

+ 10*2 


0 

0 

90 

80. 


Step 1. Elimination of x 1 

Call the first row of A the pivot row and the first equation the pivot equation. Cali the coefficient 1 of its 
A'i-term the pivot in this step. Use this equation to eliminate (get rid of xj) in the other equations. For this, do: 

Add l times the pivot equation to the second equation. 

Add —20 times the pivot equation to the fourth equation. 

This corresponds to row operations on the augmented matrix as indicated in BLUE behind the new matrix in 
(3). So the operations are performed on the preceding matrix. The result is 


"1 

-1 

1 

1 

1 

0 " 


*i — * 2 + *3=0 

0 

0 

0 

1 

1 

i 

0 

Row 2 4- Row 1 

o 

II 

o 

0 

10 

25 

i 

i 

1 

90 


10 * 2 + 25a 3 = 90 

.0 

30 

-20 

l 

1 

80. 

Row 4-20 Row 1 

30uc 2 ~ 20*3 = 80. 


Step 2. Elimination of x z 

The first equation remains as it is. We want the new second equation to serve as the next pivot equation. But 
since it has no .v 2 -term (in fact, it is 0 = 0 ), we must first change the order of the equations and the corresponding 
rows of the new matrix. We put 0 = 0 at the end and move the third equation and the fourth equation one place 
up. This is called partial pivoting (as opposed to the rarely used total pivoting , in which also the order of the 
unknowns is changed). It gives 



“1 

-1 

1 

1 

1 

0 " 

*1 - 

* 2 + -«3 

= 

0 

Pivot 10 * 

0 

© 

25 

1 

1 

1 

90 

Pivot 10 > 

(l 0 * 2 ) + 25*3 

= 

90 

Eliminate 30 — > 

0 

Ho] 

-20 

1 

l 

80 

Eliminate 30* 2 * 

30*2 — 20*3 

= 

80 


.0 

0 

0 

1 

1 

0 . 


0 

= 

0 

To eliminate * 2 , do: 









Add —3 times the 

pivot equation to 

the third equation. 




The result is 











"1 

-1 

1 [ 


0 “ 

-V'l 

- -v 2 + *3 

= 



0 

10 

25 | 


90 


10*2 + 25*3 

= 


(4) 



1 








0 

0 

-95 | 

-190 

Row 3-3 Row 2 

- 95*3 

= 

-1 


.0 

0 

o! 


oj 


0 

= 



Back Substitution . Determination of * 3 , x z , x x (in this order) 

Working backward from the last to the first equation of this “triangular” system (4), we can now readily find 
a* 3 , then and then xj: 

-95a- 3 = "190 -Yg = j 3 = 2 [Aj 

10*2 + 25 a ' 3 = 90 * 2 = ^(90 - 25 * 3 ) = / 2 = 4 [A] 

• v i - a- 2 + *3 = 0 *1 = *2 - *3 = /! = 2 fA] 

where A stands for “amperes.” This is the answer to our problem. The solution is unique. B 
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Elementary Row Operations. Row-Equivalent Systems 

Example 2 illustrates the operations of the Gauss elimination. These are the first two of 
three operations, which are called 

Elementary Row Operations for Matrices: 

Interchange of two rows 

Addition of a constant multiple of one row to another row 
Multiplication of a row by a nonzero constant c. 

CAUTION! These operations are for rows, not for columns! They correspond to the 
following 

Elementary Operations for Equations: 

Interchange of two equations 

Addition of a constant multiple of one equation to another equation 
Multiplication of an equation by a nonzero constant c. 

Clearly, the interchange of two equations does not alter the solution set. Neither does that 
addition because we can undo it by a corresponding subtraction. Similarly for that 
multiplication, which we can undo by multiplying the new equation by 1/c (since c =£ 0), 
producing the original equation. 

We now call a linear system S y row-equivalent to a linear system S 2 if Si can be 
obtained from S 2 by (finitely many!) row operations. Thus we have proved the following 
result, which also justifies the Gauss elimination. 


THEOREM 1 


Row-Equivalent Systems 

Row-equivalent linear systems have the same set of solutions. 


Because of this theorem, systems having the same solution sets are often called 
equivalent systems . But note well that we are dealing with row operations . No column 
operations on die augmented matrix are permitted in this context because they would 
generally alter the solution set. 

A linear system (1) is called overdetermined if it has more equations than unknowns, 
as in Example 2, determined if m = n, as in Example 1, and underdetermined if it has 
fewer equations than unknowns. 

Furthermore, a system (1) is called consistent if it has at least one solution (thus, one 
solution or infinitely many solutions), but inconsistent if it has no solutions at ail, as 
*1 + *2 = 1, Xi + x 2 = 0 in Example 1. 


Gauss Elimination: The Three Possible Cases of Systems 

The Gauss elimination can take care of linear systems with a unique solution (see Example 
2), with infinitely many solutions (Example 3, below), and without solutions (inconsistent 
systems; see Example 4). 
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EXAMPLE 3 


EXAMPLE 4 


Gauss Elimination if Infinitely Many Solutions Exist 

Solve die following linear systems of three equations in four unknowns whose augmented matrix is 


"3.0 

2.0 

2.0 

-5.0 

1 

1 

8.0" 



0} 

+ 2.0a 2 + 2.0a 3 — 5.0x4 = 

8.0 

0.6 

1.5 

1.5 

-5.4 

1 

l 

2.7 

Thus, 


0.6*! 

+ 1 .5x 2 + 1 .5 .v 3 — 5.4x 4 = 

2.7 

.1.2 

-0.3 

-0.3 

2.4 

1 

1 

2.1. 



Lit! 

- 0.3.v 2 - 0.3 x 3 + 2.4x 4 = 

2.1. 


Solution . As in the previous example, we circle pivots and box terms of equations and corresponding entries 
to be eliminated. We indicate the operations in terms of equations and operate on both equations and matrices. 

Step /. Elimination of ^ from the second and third equations by adding 

— 0.6/3.G = —0.2 times the first equation to the second equation, 

— 1. 2/3.0 — —0.4 times the first equation to the third equation. 


This gives the following, in which the pivot of the next step is circled. 



"3.0 

2.0 

2.0 

-5.0 1 8.0' 

1 

3.0a*i + 2.0x 2 + 

2.0x 3 — 5.0a * 4 = 

8.0 

( 6 ) 

0 

1.1 

1.1 

-4.4 | 1.1 

Row 2 - 0.2 Row 1 (j.lx 2 )-f 

l.lxj ~ 4 . 4 x 4 = 

l.l 


0 

-1.1 

-1.1 

4.4 1 -1.1. 

Row 3 - 0.4 Row l |- \.\x 2 \- 

l.lx 3 4* 4.4x 4 = 

-1.1 

Step 2. Elimination of x 2 from the third equation of ( 6 ) by adding 






LI/ 1. 1 = 1 times the second equation to the third equation. 



This gives 








"3.0 

2.0 

2.0 

Ul 

O 

OO 

b 

3-O.vj -f 2 . 0 x 2 + 

2.0x 3 — 5.0x 4 = 

8.0 

(7) 

0 

1.1 

1.1 

-4.4 | 1.1 

1.1*2 + 

1.1 A 3 — 4.4x 4 = 

1.1 


0 

0 

0 

0 1 0. 

Row 3 + Row 2 

0 = 

0 . 


Back Substitution. From the second equation, ,r 2 — l — *3 + 4.v 4 . From this and the first equation, 
xi = 2 - a* 4 . Since .v 3 and .v 4 remain arbitrary, we have infinitely many solutions. If we choose a value of 
x 3 and a value of x 4 , then the corresponding values of x t and .v 2 are uniquely determined. 

Ott Notation. If unknowns remain arbitrary, it is also customary to denote them by other letters t v t 2 , • * * . 
In this example we may thus write x\ = 2 — x 4 = 2 - / 2 , x 2 = 1 — x 3 + 4x 4 = 1 — /j -f 4/ 2 , x 3 = / x (first 
arbitrary unknown), x 4 = / 2 (second arbitrary unknown). H 


Gauss Elimination if no Solution Exists 

What will happen if we apply the Gauss elimination to a linear system that has no solution? The answer is that 
in this case the method will show this fact by producing a contradiction. For instance, consider 


"3 2 1 1 3" 

1 

0 

+ 2.V 2 + A*3 = 3 

2 1 1 { 0 


2xi 

0 

II 

i? 

+ 

+ 

_6 2 4 16. 


6x x 

+ 2x 2 + 4 a 3 = 6. 


Step 1. Elimination of x 1 from the second and third equations by adding 

— § times the first equation to the second equation, 

— § = -2 times the first equation to the third equation. 


This gives 


‘3 

0 

.0 


2 

.1 

3 


1 ! 3 " 

§ J— 2 Row2-§Rowl 

1 

2 1 0 J Row 3-2 Row I 


3a*i -I- Zv 2 + a 3 = 3 

3 A 3 = “2 

1~ 2.v 2 1 + 2.Y3 = 0. 
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Step 2. Elimination of x% from the third equation gives 


“3 

2 

1 

1 3] 
1 


3x 1 + lv 2 + a *3 = 3 

0 

_i 

3 

3 

[-2 


- s-v 2 + £.v 3 = -2 

.0 

0 

0 

1 

l 12. 

Row 3 - 6 Row 2 

0= 12. 


The false statement 0=12 shows that the system has no solution. I 

Row Echelon Form and Information From It 

At the end of the Gauss elimination the form of the coefficient matrix, the augmented 
matrix, and the system itself are called the row echelon form. In it, rows of zeros, if 
present, are the last rows, and in each nonzero row the leftmost nonzero entry is farther 
to the right than in the previous row. For instance, in Example 4 the coefficient matrix 
and its augmented in row echelon form are 


"3 

2 

r 


' 3 

2 

1 

1 

1 

1 

3“ 

0 

3 

i 

3 

and 

0 

i 

3 

1 

3 

1 

1 

1 

1 

-2 

_0 

0 

0. 


. 0 

0 

0 

1 

1 

12_ 


Note that we do not require that the leftmost nonzero entries be 1 since this would have 
no theoretic or numeric advantage. (The so-called reduced echelon form , in which those 
entries ore 1, will be discussed in Sec. 7.8.) 

At the end of the Gauss elimination (before the back substitution) the row echelon form 
of the augmented matrix will be 



Here, r ^ m and a u =£ 0, c^ & 0, • • • , * 0, and all the entries in the blue triangle 

as well as in the blue rectangle are zero. From this we see that with respect to solutions 
of the system with augmented matrix (8) (and thus with respect to the originally given 
system) there are three possible cases: 

(a) Exactly one solution if r = n and h r+] , ■ • , b m> if present, are zero. To get the 
solution, solve the nth equation corresponding to (8) (which is k nn x n = b n ) for x n> then 
the (n — l)st equation for A n _ x , and so on up the line. See Example 2, where r = n = 3 
and m = 4. 

(b) Infinitely many solutions if r < n and b r+i , • • • , b mr if present, are zero. To obtain 
any of these solutions, choose values of a ? . + ! , • • • , x n arbitrarily. Then solve the /th equation 
for x r9 then the ( r — l)st equation for x r _ ly and so on up the line. See Example 3. 

(c) No solution if r < m and one of the entries b r+u • • • , b m is not zero. See Example 
4, where r = 2 < m = 3 and ? r+ , = b 3 = 12. 
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M6| gauss elimination and back 

SUBSTITUTION 


Solve the following systems or indicate the nonexistence of 
solutions. (Show the details of your work.) 

1. 5a - 2y = 20.9 

2. 3.0a 4 6.2y = 0.2 

-A + 4 y = -19.3 

2.1a 4 8.5.y = 4.3 

3. 0.5a 4 3.5y = 5.7 

4. 4y — 2z = 2 

-a 4 5.0y = 7.8 

6a - 2y 4 z — 29 


4a 4 8y - 4z = 24 

5. 0.8a 4 1.2y - 0.6z = 

-7.8 

2.6a 4 1.7 z = 

15.3 

4.0a - 7.3y - l.5z = 

1.1 

6. 14a — 2 y — 4z = 0 

7. y 4 z = — 2 

18a - 2y - 6z = 0 

4y 4 6z = -12 

4a 4 8y - 14z = 0 

a 4 y 4 z = 2 

8. 2a 4 y — 3z = 8 

9. 4y 4 4z = 24 

5a 4 2z = 3 

3a - 1 ly - 2z = -6 

8a - y 4 lz = 0 

ON 

<■: 

1 

4 

M 

II 

00 


10. 0.6a- 4 03y - OAz = -1.9 
—4.6a + 0.5y + 1.2z = -1.3 

11. 2a - y 4 3z = -1 

—4a 4 2y — 6z = 2 

12. — 2>* - 2z = -8 
3a 4 4y - 5z = 13 

13. a 4 y — 2z = 0 

— 4vv - a - y 4 2z = — 4 
— 2vv 4- 3 a 4- 3y — 6z = —2 

14. iv — 2 a 4 5v — 3z = 0 
— 3iv 4 6a 4 7 + z = 0 

2 iv - 4 a 4- 3 y — z = 3 


15. 3a -1- 7y - 4z = -46 

5iv 4 4 a 4 8y 4 z — 7 

8 iv 4 4y — 2z = 0 

— iv 4- 6a 4 2z = L3 


16. — 2w — 

17a- + 4y + 3i = 

0 

7vv 

+ 3>- -2z = 

0 


2a + 8>- -6 z = 

-20 

5 iv — 

13a — y + 5z = 

16 


[rM9] models of electrical networks 

Using Kirchhoff s laws (see Example 2), find the currents. 
(Show the details of your work.) 



Wheatstone bridge Net of one-way streets 

(Prob. 20, next page) (Prob. 21, next page) 
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20. (Wheatstone bridge) Show that if R x /R 3 = Ri/R 2 in 
the figure, then 7 = 0. ( R 0 is the resistance of the 
instrument by which I is measured.) This bridge is a 
method for determining R x . R x , /? 2 , R2 are known. R 3 
is variable. To get R x , make / = 0 by varing R 3 . Then 
calculate R x = R 3 R 1 /R 2 . 

21. (Traffic flow) Methods of electrical circuit analysis 
have applications to other fields. For instance, applying 
the analog of Kirchhoff s current law, find the traffic 
flow (cars per hour) in the net of one-way streets (in 
the directions indicated by the arrows) shown in the 
figure. Is the solution unique? 

22. (Models of markets) Determine the equilibrium 
solution ( Dj = Si, D 2 = S 2 ) of the two-commodity 
market with linear model (£), S, P = demand, supply, 
price; index 1 = first commodity, index 2 = second 
commodity) 

D x = 60 - 2 P x - P 2 , S t = 4 P x - 2 P 2 + 14 

D 2 = 4 P x - P 2 4 10, S 2 = 5P 2 - 2. 

23. (Equivalence relation) By definition, an equivalence 
relation on a set is a relation satisfying three conditions 
(named as indicated): 

(i) Each element A of the set is equivalent to itself 
(" Reflexivity ”). 

(ii) If A is equivalent to B y then B is equivalent to A 
( Symmetry ”). 

(iii) If A is equivalent to B and B is equivalent to C, 
then A is equivalent to C {“Transitivity"). 

Show that row equivalence of matrices satisfies these 
three conditions. Hint. Show that for each of the three 
elementary row operations these conditions hold. 

24. PROJECT. Elementary Matrices. The idea is that 
elementary operations can be accomplished by matrix 
multiplication. If A is an m X n matrix on which we 
want to do an elementary operation, then there is a 
matrix E such that EA is the new matrix after the 
operation. Such an E is called an elementary matrix. 
This idea can be helpful, for instance, in the design of 
algorithms. ( Computationally , it is generally preferable 


to do row operations directly , rather than by 
multiplication by E.) 

(a) Show that the following are elementary matrices, 
for interchanging Rows 2 and 3, for adding —5 times 
the first row to the third, and for multiplying the fourth 
row by 8. 

0 0 0 " 

0 1 0 

1 0 0 

0 0 1 . 

0 0 0 " 

1 0 0 

0 1 0 

0 0 1 . 

0 0 0 “ 

1 0 0 

0 1 0 

0 0 8. 

Apply Ej, E 2 , E 3 to a vector and to a 4 X 3 matrix of 
your choice. Find B = E 3 E 2 E!A, where A = [a jk ] is 
the general 4X2 matrix. Is B equal to C = E^E-jA? 

(b) Conclude that E x , E* 2 , E 3 are obtained by doing 
the corresponding elementary operations on the 4 X 4 
unit matrix. Prove that if M is obtained from A by an 
elementary row operation , then 

M = EA, 

where E is obtained from the n X n unit matrix l n by 
the same row operation. 

25. CAS PROJECT. Gauss Elimination and Back 
Substitution. Write a program for Gauss elimination 
and back substitution (a) that does not include pivoting, 
(b) that does include pivoting. Apply the programs to 
Probs. 1 3-1 6 and to some larger systems of your choice. 


Ei = 


E 2 = 


E* = 


1 

0 

0 

0 

1 

0 

-5 

0 

1 

0 

0 

0 


7.4 Linear Independence. Rank of a Matrix. 
Vector Space 

In the last section we explained the Gauss elimination with back substitution, the most 
important numeric solution method for linear systems of equations. It appeared that such 
a system may have a unique solution or infinitely many solutions, or it may be inconsistent, 
that is, have no solution at all, Hence we are confronted with the questions of existence 
and uniqueness of solutions. We shall answer these questions in the next section. As the 
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EXAMPLE 1 


DEFINITION 


key concept for this (and other questions) we introduce the rank of a matrix . To define 
rank, we first need the following concepts, which are of general importance. 

Linear Independence and Dependence of Vectors 

Given any set of m vectors a (1) , • • • , a (m) (with the same number of components), a linear 
combination of these vectors is an expression of the form 


c l a <l) + c 2 a (2) "b • * • + C 7u a (7n) 


where c l9 c 2 , * • • , c m are any scalars. Now consider the equation 
(1) + c 2 a (2) + • • • + c m a Cm) = 0. 

Clearly, this vector equation (1) holds if we choose all c/s zero, because then it becomes 
0 = 0. If this is the only m-tuple of scalars for which (1) holds, then our vectors 
a a) , • • • , a (m> are said to form a linearly independent set or, more briefly, we call them 
linearly independent. Otherwise, if (1) also holds with scalars not all zero, we call these 
vectors linearly dependent, because then we can express (at least) one of them as a 
linear combination of the others. For instance, if (1) holds with, say, c\ ± 0, we can 
solve (1) for a (1) : 

3(1) = * 23(2) + • • • + *m3(m> where kj = -Cj/Cj. 

(Some k/s may be zero. Or even all of them, namely, if a (1) = 0.) 

Why is this important? Well, in the case of linear dependence we can get rid of some 
of the vectors until we arrive at a linearly independent set that is optimal to work with 
because it is smallest possible in the sense that it consists only of the “really essential” 
vectors, which can no longer be expressed linearly in terms of each other. This motivates 
the idea of a “basis” used in various contexts, notably later in our present section. 

Linear Independence and Dependence 

The three vectors 

a ( i) = [ 3 0 2 2] 

3(2) = [ “6 42 24 54] 

3(3) = 121 -21 0 -15] 


are linearly dependent because 


6a (l) “ 2 a <2> “ a (3) - 0- 


Although this is easily checked (do it!), it is not so easy to discover. However, a systematic method for finding 
out about linear independence and dependence follows below. 

The first two of the three vectors are linearly independent because c^iy + c 2 a (2) = 0 implies c 2 = 0 (from 
the second components) and then c x = 0 (from any other component of a a) ). B 

Rank of a Matrix 


The rank of a matrix A is the maximum number of linearly independent row vectors 
of A. It is denoted by rank A. 
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Our further discussion will show that the rank of a matrix is an important key concept for 
understanding general properties of matrices and linear systems of equations. 


EXAMPLE 2 Rank 

The matrix 


( 2 ) 



" 3 

0 

2 

2' 

A = 

-6 

42 

24 

54 


. 21 

-21 

0 

“15. 


has rank 2, because Example 1 shows that the first two row vectors are linearly independent, whereas all three 
row vectors are linearly dependent. 

Note further that rank A = 0 if and only if A = 0. This follows directly from the definition. I 


We call a matrix A! row-equivalent to a matrix A 2 if A x can be obtained from A 2 by 
(finitely many!) elementary row operations. 

Now the maximum number of linearly independent row vectors of a matrix does not 
change if we change the order of rows or multiply a row by an nonzero c or take a linear 
combination by adding a multiple of a row to another row. This proves that rank is 
invariant under elementary row operations: 


THEOREM 1 


Row-Equivalent Matrices 

Row-equivalent matrices have the same rank. 


Hence we can determine the rank of a matrix by reduction to row-echelon form 
(Sec. 7.3) and then see the rank directly. 

EXAMPLE 3 Determination of Rank 

For the matrix in Example 2 we obtain successively 


’ 3 

0 

2 

2“ 


-6 

42 

24 

54 

(given) 

. 21 

-21 

0 

-15. 


" 3 

0 

2 

2 ~ 


0 

42 

28 

58 

Row 2 + 2 Row 1 

0 

-21 

-14 

-29_ 

Row 3-7 Row 1 

" 3 

0 

2 

2“ 


0 

42 

28 

58 


_ 0 

0 

0 

0_ 

Row 3+5 Row 2 


Since rank is defined in terms of two vectors, we immediately have the useful 


THEOREM 2 


Linear Independence and Dependence of Vectors 

p vectors with n components each are linearly independent if the matrix with these 
vectors as row vectors has rank p> but they are linearly dependent if that rank is 
less than p . 
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THEOREM 3 


PROOF 


EXAMPLE 4 


Further important properties will result from the basic 


Rank in Terms of Column Vectors 

The rank r of a matrix A equals the maximum number of linearly independent 
column vectors of A. 

Hence A and its transpose A T have the same rank. 


In this proof we write simply “rows” and “columns” for row and column vectors. Let 
A be an m X n matrix of rank A = r. Then by definition of rank, A has r linearly 
independent rows which we denote by v (1> , • • • # v (r) (regardless of their position in A), 
and all the rows a (1) , • • • , a (m) of A are linear combinations of those, say, 

a (l) = c ll v (l) “b c 12 v (2) + * ' * + c lr v (r) 


( 3 ) 


a (2) ~~ C 21 V (1) + 6 22 v (2) + * ‘ + C 2r V (r) 


a (m) c ?ul v Cl) “b c m2^(Z) * "b ^mr v (r). 


These are vector equations for rows. To switch to columns, we write (3) in terms of 
components as n such systems, with k = 


a l k - c ll v lk + c l2 v 2k + • • • + C lr u rfc 
^2 k c 21^1k “b ^22^2 k "b * * * “b C^r^rk 


&mk CmlVlk “b ^m2^2k "b * ‘ * "b C rnr V r j c 


and collect components in columns. Indeed, we can write (4) as 


( 5 ) 


a lk 


C 11 


c 12 


c lr 

a 2k 

= v lk 

... £ 
M 

+ V 2k 

9 ••• 

+ • • • + v rk 

^2 r 

J-l mk _ 




_^m2_ 




where k — Now the vector on the left is the £th column vector of A. We see 

that each of these n columns is a linear combination of the same r columns on the right. 
Hence A cannot have more linearly independent columns than rows, whose number is 
rank A = /*. Now rows of A are columns of the transpose A T . For A T our conclusion is 
that A T cannot have more linearly independent columns than rows, so that A cannot have 
more linearly independent rows than columns. Together, the number of linearly 
independent columns of A must be r, the rank of A. This completes the proof. ■ 


Illustration of Theorem 3 

The matrix in (2) has rank 2. From Example 3 we see that the first two row vectors are linearly independent 
and by “working backward” we can verify that Row 3 = 6 Row I — \ Row 2. Similarly, the first two columns 
are linearly independent, and by reducing the last matrix in Example 3 by columns we find that 

Column 3 = | Column 1 + f Column 2 and Column 4 = § Column 1 + f? Column 2. ■ 
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THEOREM 4 


PROOF 


EXAMPLE 5 


THEOREM 5 


PROOF 


CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems 
Combining Theorems 2 and 3 we obtain 


Linear Dependence of Vectors 

p vectors with n < p components are always linearly dependent. 


The matrix A with those p vectors as row vectors has p rows and n < p columns; hence by 
Theorem 3 it has rank A ^ n < p , which implies linear dependence by Theorem 2. ■ 

Vector Space 

The following related concepts are of general interest in linear algebra. In the present 
context they provide a clarification of essential properties of matrices and their role in 
connection with linear systems. 

A vector space is a (nonempty) set V of vectors such that with any two vectors a and 
b in V all their linear combinations <xa ■+■ /3b (a, ft any real numbers) are elements of V, 
and these vectors satisfy the laws (3) and (4) in Sec. 7.1 (written in lowercase letters a, 
b, u, • • • , which is our notation for vectors). (This definition is presently sufficient. 
General vector spaces will be discussed in Sec. 7.9.) 

The maximum number of linearly independent vectors in V is called the dimension of 
V and is denoted by dim V. Here we assume the dimension to be finite; infinite dimension 
will be defined in Sec. 7.9. 

A linearly independent set in V consisting of a maximum possible number of vectors 
in V is called a basis for V. Thus the number of vectors of a basis for V equals dim V. 

The set of all linear combinations of given vectors a cl) , • • • , a (p) with the same 
number of components is called the span of these vectors. Obviously, a span is a vector 
space. 

By a subspace of a vector space V we mean a nonempty subset of V (including V itself) 
that forms itself a vector space with respect to the two algebraic operations (addition and 
scalar multiplication) defined for the vectors of V. 

Vector Space, Dimension, Basis 

The span of the three vectors in Example I is a vector space of dimension 2, and a basis is a (1) , a (2 >, for instance, 
or aQ). a <3) , etc. ■ 

We further note the simple 


Vector Space R" 

The vector space R n consisting of all vectors with n components (n real numbers) 
has dimension n. 


A basis of n vectors is a (1) = [1 0 • • • 0], a^, = [0 I 0 • • • 0], • * • , 

acn, = 10 0 1], ■ 


In the case of a matrix A we call the span of the row vectors the row space of A and the 
span of the column vectors the column space of A. 
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Now, Theorem 3 shows that a matrix A has as many linearly independent rows as 
columns. By the definition of dimension, their number is the dimension of the row space 
or the column space of A. This proves 


THEOREM 6 


Row Space and Column Space 

The row space and the column space of a matrix A have the same dimension , equal 
to rank A. 


Finally, for a given matrix A the solution set of the homogeneous system Ax = 0 is a 
vector space, called the null space of A, and its dimension is called the nullity of A. In 
the next section we motivate and prove the basic relation 

(6) rank A + nullity A = Number of columns of A. 
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1 1-12 1 RANK, ROW SPACE, COLUMN SPACE 

Find the rank and a basis for the row space and for the 
column space. Hint. Row-reduce the matrix and its 
transpose. (You may omit obvious factors from the vectors 
of these bases.) 



" 1 

-2“ 




" 8 

2 

5“ 

1. 

0 

0 



2. 

16 

6 

29 


.-3 

6_ 




. 4 

0 

—7. 


"0 

-2 

1 

3“ 


~ a 

b 

c~ 


3. 

1 

4 

0 

7 

4. 

_b 

a 

c_ 



.5 

5 

5 

5. 







“ 0 

3 

4“ 



“1 

1 

a 


5. 

-3 

0 

-5 


6. 

1 

a 

1 



4 

5 

0. 



_a 

1 

1. 




11 . 


16 

4 


2 3 4“ 

3 4 5 

4 5 6 

5 6 7. 

4 8 16" 

8 4 2 

8 16 2 



16 8 4. 


0 —7 r 

0 5 0 

5 0 2 

0 2 0 . 



13-20 


LINEAR INDEPENDENCE 


Are the following sets of vectors linearly independent? 
(Show the details.) 


13. [3 -2 0 4], [5 0 0 1], [-6 1 0 1], 

[2 0 0 3] 

14. [1 1 0], [I 0 0], [ 1 1 1] 

15. [6 0 3 1 4 2], [0 -1 2 7 0 5], 

[12 3 0 -19 8 -11] 

16. [3 4 7], [2 0 3], [8 2 3], [5 5 6] 


17. [0.2 1.2 5.3 2.8 1.6], 

[4.3 3.4 0.9 2.0 -4.3] 
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18. [3 

19. [l 

ft 

20. [1 


2 1], [0 0 0], [4 3 6] 

1 i il fi l i ii fi i i 

2 3 4 }' [2 3 4 5J* 13 4 5 6> 

1 A 11 

5 6 7J 

2 3 4], [2 3 4 5]. [3 4 5 6], 


[4 5 6 7] 


25. If the row vectors of a square matrix are linearly 
independent, so are the column vectors, and 
conversely. 

26. Give examples showing that the rank of a product of 
matrices cannot exceed the rank of either factor. 


21. CAS Experiment. Rank, (a) Show experimentally 
that the n X n matrix A = \aj k ] with aj k = j + k — I 
has rank 2 for any n. (Problem 20 shows n = 4.) Try 
to prove it. 

(b) Do the same when a jk = j + k + c. where c is 
any positive integer. 

(c) What is rank A if a$ k = 2 J I ,C ~ 2 ? Try to find other 
large matrices of low rank independent of n. 

22-26 1 PROPERTIES OF RANK 

AND CONSEQUENCES 

Show the following. 

22. rank B T A T = rank AB. (Note the order!) 

23. rank A = rank B does not imply rank A 2 = rank B 2 . 
(Give a counterexample.) 

24. If A is not square, either the row vectors or the column 
vectors of A are linearly dependent. 


27-36 1 VECTOR SPACES 

Is the given set of vectors a vector space? (Give reason.) If 
your answer is yes, determine the dimension and find a 
basis. (v t , v 2 y * * ■ denote components.) 

27. All vectors in R 3 such that + v 2 = 0 

28. All vectors in R 4 such that 2v 2 - 3u 4 = k 

29. All vectors in R 3 with u, ^ 0, u 2 = — 4v 3 

30. All vectors in R 2 with u j ^ u 2 

31. All vectors in R 3 with 4v\ + v 3 = 0, 3v 2 = v 3 

32. All vectors in R 4 with v x — v 2 = 0, v 3 = 5v^ v 4 = 0 

33. All vectors in R n with |yj| ^ I for j = I 

34. All ordered quadruples of positive real numbers 

35. All vectors in R 5 with - 2v 2 = 3t; 3 = 4u 4 = 5u 5 

36. All vectors in R 4 with 

3ui - u 3 = 0. 2v l 4- 3v 2 ~ 4u 4 = 0 


7.5 Solutions of Linear Systems: 

Existence, Uniqueness 

Rank as just defined gives complete information about existence, uniqueness, and general 
structure of the solution set of linear systems as follows. 

A linear system of equations in n unknowns has a unique solution if the coefficient matrix 
and the augmented matrix have the same rank n, and infinitely many solution if that common 
rank is less than n. The system has no solution if those two matrices have different rank. 

To state this precisely and prove it, we shall use the (generally important) concept of 
a submatrix of A. By this we mean any matrix obtained from A by omitting some rows 
or columns (or both). By definition this includes A itself (as the matrix obtained by omitting 
no rows or columns): this is practical. 

THEOREM 1 Fundamental Theorem for Linear Systems 

(a) Existence. A linear system ofm equations in n unknowns x v • • • , x n 

fln-Vi + Cl 12*2 + ■ ■ ■ + a ln x n = b x 
+ a 2 2*2 + • • • + a 2n x n = b 2 

( 1 ) 


"ml*! Cl 2 “1“ ’ * * *1" Cl mn X n b m 
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is consistent, that is, has solutions, if and only if the coefficient matrix A and the 
augmented matrix A have the same rank . Here, 


"«1X 


a ln 


’*11 


din 

1 

1 

b r 

. 

. . . 

. 

and A = 

. 

... 

. 

K 

i 

i 

a 

, 

_ a ml 


^mn. 


_ a ml 


&mn 

l 

i 

b m _ 


(b) Uniqueness. The system (I) has precisely one solution if and only if this 
common rank r of A and A equals n. 

(c) Infinitely many solutions. If this common rank r is less than n, the system 
(1) has infinitely many solutions . All of these solutions are obtained by detetmining 
r suitable unknowns {whose submatrix of coefficients must have rank r) in terms of 
the remaining n — r unknowns, to which arbitrary values can be assigned. (See 
Example 3 in Sec. 7.3.) 

(d) Gauss elimination (Sec. 7.3). If solutions exist , they can all be obtained by 
the Gauss elimination. (This method will automatically reveal whether or not 
solutions exist; see Sec. 7.3.) 


PROOF (a) We can write the system (1) in vector form Ax = b or in terms of column vectors 

C (l)» ’ ’ ' > c (n) A: 

(2) Ca)*! + c (2) a* 2 + • • • + c (n) A n = b. 

A is obtained by augmenting A by a single column b. Hence, by Theorem 3 in Sec. 7.4, 
rank A equals rank A or rank A + 1. Now if (1) has a solution x, then (2) shows that b 
must be a linear combination of those column vectors, so that A and A have the same 
maximum number of linearly independent column vectors and thus the same rank. 

Conversely, if rank A = rank A, then b must be a linear combination of the column 
vectors of A, say, 

(2*) b = £*!<:<!) + • • • + a„c (n) 

since otherwise rank A = rank A + 1. But (2*) means that (1) has a solution, namely, 
= a 1? • * • , x n = a n , as can be seen by comparing (2*) and (2). 

(b) If rank A = n, the n column vectors in (2) are linearly independent by Theorem 3 
in Sec. 7.4. We claim that then the representation (2) of b is unique because otherwise 

C U)*1 + * * * + C (n) X n = Cq)?! + • • • + C (n) X n . 

This would imply (take all terms to the left, with a minus sign) 


(a x - *i)c (1> +••• + (**- A n )c <n) = 0 

and x 1 — Xi = 0, • • • , x n — x n = 0 by linear independence. But this means that the 
scalars x l9 • • • , x n in (2) are uniquely determined, that is, the solution of (1) is unique. 
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(c) If rank A = rank A = r < n> then by Theorem 3 in Sec. 7.4 there is a linearly 
independent set K of r column vectors of A such that the other n — r column vectors of 
A are linear combinations of those vectors. We renumber the columns and unknowns, 
denoting the renumbered quantities by ", so that {c (1) , • • • , c (r) } is that linearly independent 
set K. Then (2) becomes 


C(1>*1 + * * * + C (r) J c r + C (r+1) X r+1 + • • • + C (n) x n = b, 

£(r+i)> • • • » C(n) are linear combinations of the vectors of K y and so are the vectors 
Jr r+ iC (r+ i), • • • , A* n c (n) . Expressing these vectors in terms of the vectors of K and collecting 
terms, we can thus write the system in the form 

(3) c a) .V! + • • • + c ( , .)>•,. = b 

with yj = Xj + fa where results from the n — r terms c (r+1) A* r+ i, * * , c (n) jc n ; here, 
j = 1, • • • , r. Since the system has a solution, there are y l9 • • • , y r satisfying (3). These 
scalars are unique since K is linearly independent. Choosing Jt* r +i, • * * ,Jt n fixes the fy 
and corresponding Xj = - fy, where j = !,**•,/*. 

(d) This was discussed in Sec. 7.3 and is restated here as a reminder. ■ 

The theorem is illustrated in Sec. 7.3. In Example 2 there is a unique solution since 
rank A = rank A = n = 3 (as can be seen from the last matrix in the example). In Example 

3 we have rank A = rank A = 2 < n = 4 and can choose x 3 and x 4 arbitrarily . In Example 

4 there is no solution because rank A = 2 < rank A = 3. 

Homogeneous Linear System 

Recall from Sec. 7.3 that a linear system (1) is called homogeneous if all the b/s are 
zero, and nonhomogeneous if one or several b/s are not zero. For the homogeneous 
system we obtain from the Fundamental Theorem the following results. 


Homogeneous Linear System 

A homogeneous linear system 


«llA'i + a 12*2 + • 

“1" &ln x n 0 

a 21*l ^22*2 "b ‘ 

(4) 

• • + a 2n x n = 0 

a ml x l "h a m2 x 2 + ‘ 

4” t.i mn x n 0 

always has the trivial solution x x = 0, • • • , x n = 0. Nontrivial solutions exist if and 
only if rank A < n. If rank A — r < n, these solutions, together with x = 0 y form a 
vector space (see Sec. 7.4) of dimension n — r, called the solution space of (4). 

In particular ; f/x (1> and x (2 > are solution vectors of { 4), then x = CxXqj + c 2 x (2) 
with any scalars c 1 and c 2 is a solution vector of (4). (This does not hold for 
nonhomogeneous systems. Also, the term solution space is used for homogeneous 
systems only.) 
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PROOF The first proposition can be seen directly from the system. It agrees with the fact that 
b = 0 implies that rank A = rank A, so that a homogeneous system is always consistent 
If rank A = rc, the trivial solution is the unique solution according to (b) in Theorem 1. 
If rank A < rc, there are nontrivial solutions according to (c) in Theorem 1. The solutions 
form a vector space because if x (1) and x (2) are any of them, then Ax <1) = 0, Ax (2) = 0, 
and this implies A(x a) + x (2) ) = Ax a) + Ax (2> = 0 as well as A(cx (1) ) = cAx (1) = 0, 
where c is arbitrary. If rank A = r < n. Theorem 1 (c) implies that we can choose 
n — r suitable unknowns, call them x r+1> • * • , x n , in an arbitrary fashion, and every 
solution is obtained in this way. Hence a basis for the solution space, briefly called a basis 
of solutions of (4), is y (1) , • • • , y (n _ r) , where the basis vector y (j) is obtained by choosing 
x r +j = 1 and the other x r+ll • • • , x n zero; the corresponding first r components of this 
solution vector are then determined. Thus the solution space of (4) has dimension n — r. 
This proves Theorem 2. ■ 

The solution space of (4) is also called the null space of A because Ax = 0 for every x 
in the solution space of (4). Its dimension is called the nullity of A. Hence Theorem 2 
states that 

(5) rank A + nullity A = n 

where n is the number of unknowns (number of columns of A). 

Furthermore, by the definition of rank we have rank A ^ m in (4). Hence if m < n 9 
then lank A < n. By Theorem 2 this gives the practically important 


THEOREM 3 


Homogeneous Linear System with Fewer Equations Than Unknowns 

A homogeneous linear system with fewer equations than unknowns has always 
nontrivial solutions. 


Nonhomogeneous Linear Systems 

The characterization of all solutions of the linear system (1 ) is now quite simple, as follows. 


THEOREM 4 


Nonhomogeneous Linear System 

If a nonhomogeneous linear system (1) is consistent , then all of its solutions are 
obtained as 

(6) x = x 0 + x h 

where x 0 is any (fixed) solution of ( 1 ) and x h runs through all the solutions of the 
corresponding homogeneous system (4). 


PROOF The difference x h = x - x 0 of any two solutions of (1) is a solution of (4) because 
Ax h = A(x - x 0 ) = Ax - Ax 0 = b - b = 0. Since x is any solution of (1), we get all 
the solutions of (1) if in (6) we take any solution x 0 of (1) and let x h vary throughout the 
solution space of (4). ■ 
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7.6 For Reference: 

Second- and Third-Order Determinants 

We explain these determinants separately from the general theory in Sec. 7.7 because they 
will be sufficient for many of our examples and problems. Since this section is for 
reference, go on to the next section, consulting this material only when needed. 

A determinant of second order is denoted and defined by 


( 1 ) 


D = det A = 


flu 


a 21 


a 12 

a 22 


— a lx a 22 Cl i 2 a 2 i- 


So here we have bars (whereas a matrix has brackets). 

Cramer’s rule for solving linear systems of two equations in two unknowns 


(a) flu*! + a 12 x 2 = b x 


(2) 

(b) a 21 xx + a 2Z x 2 

— t>2 

is 


b\ a 12 





b 2 o 22 

bx a 22 

~ a 12 b 2 


A *1 “ 

D 


D 

(3) 


«n b x 




= - 

a 2i b 2 

_ anb 2 

~ b x a 2 x 


D D 


with D as in (1), provided 

D* 0 . 

The value D — 0 appears for inconsistent nonhomogeneous systems and for homogeneous 
systems with nontrivial solutions. 

PROOF We prove (3). To eliminate x 2 , multiply (2a) by a 22 and (2b) by — a 12 and add. 


(^ 11^22 — a l2 cl 2l) x l ~ b X Cl 22 — a^2,b 2 . 


Similarly, to eliminate x x , multiply (2a) by — a 2i and (2b) by a u and add, 


(#11^22 ci X2 a 2 i)x 2 — cinb 2 — b x a 2 i. 


Assuming that D = aua 2 2 - a r2 a 2 x ^ 0, dividing, and writing the right sides of these 
two equations as determinants, we obtain (3). ■ 
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EXAMPLE 1 


Cramer’s Rule for Two Equations 




12 

3 



4 

12 


4xi + 3*2 = 12 

then *x = * 

-8 

5 

= *i=6, 

14 

*2 — 

2 

-8 

-56 

2* x + 5*2 = — 8 

4 

3 

4 

3 

14 



2 

5 



2 

5 



Third-Order Determinants 

A determinant of third order can be defined by 




flu 

^12 

a 13 







(4) 

D = 





^22 

a 23 


a 12 

a 13 


a 12 

a 13 

a 21 

a 22 

a 23 

— dn 



a 21 



+ a 31 









a 32 

a 33 


a 32 

a 33 


a 22 

a 23 



^31 

a 32 

a 33 

i 







Note the following. The signs on the right are H h Each of the three terms on the 

right is an entry in the first column of D times its minor, that is, the second-order 
determinant obtained from D by deleting the row and column of that entry; thus, for a Y1 
delete the first row and first column, and so on. 

If we write out the minors in (4), we obtain 

(4*) D = dud 22 a 33 ^11^23^32 + a 21 a 13 a 32 ~~ a 2l a l2 a 33 a 31 a 12 a 23 a 31 a 13 a 22‘ 


Cramer's Rule for Linear Systems of Three Equations 




+ 

a 12 x 2 



(5) 

&2\ x l 

+ 

d22 x 2 

+ a 23 x Z 

= b 2 


a 3\ x l 

+ 

d$2 x 2 

“t a 33 x 3 

= b z 

is 






(6) 

Oi 



d 2 


d - * 


X 2 = 

D 5 

*3 = 


(P* 0 ) 


with the determindnt D of the system given by (4) and 



h 

a l2 

a 13 


#11 

b 1 

#13 


flu 

#12 

b 1 

01 = 

l>2 

a 22 

#23 

5 D 2 ~ 

a 2 1 

b 2 

#23 

> ^3 — 

a 21 

#22 

b 2 


^3 

^32 

#33 


<*31 

b 3 

#33 


a 3l 

#32 

b 3 


Note that D x , Z> 2 , Z) 3 are obtained by replacing Columns 1, 2, 3, respectively, by the 
column of the right sides of (5). 

Cramer’s rule (6) can be derived by eliminations similar to those for (3), but it also 
follows from the general case (Theorem 4) in the next section. 
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7.1 Determinants. Cramer’s Rule 

Determinants were originally introduced for solving linear systems. Although impractical 
in computations, they have important engineering applications in eigenvalue problems 
(Sec. 8.1), differential equations, vector algebra (Sec. 9.3), and so on. They can be 
introduced in several equivalent ways. Our definition is particularly practical in connection 
with linear systems. 

A determinant of order n is a scalar associated with an nX n (hence squarel) matrix 
A = which is written 


(1) D = detA = 


011 

012 


01n 

021 

1 ‘ 

. . . 

02 n 

0nl 

0n2 


0nn 


and is defined for n = 1 by 

(2) D = a u 

and for 2 by 

(3a) D 0ji^ji ”1” 0j2 Q 2 “P * * * “P QjnCjn 0 1> 2, * * * , or n) 

or 

(3b) D = a lk C lk + a 2k C 2k + • • • + a nk C nk (k = 1, 2, • • • , or n) 

Here, 

c jk = (~iy +k M jk 

and M jk is a determinant of order n — 1, namely, the determinant of the submatrix of A 
obtained from A by omitting the row and column of the entry Oj k , that is, the jth row and 
the Mi column. 

In this way, D is defined in terms of n determinants of order n — l, each of which is, 
in turn, defined in terms of n — 1 determinants of order n — 2, and so on; we finally 
arrive at second-order determinants, in which those submatrices consist of single entries 
whose determinant is defined to be the entry itself. 

From the definition it follows that we may expand D by any row or column , that is, 
choose in (3) the entries in any row or column, similarly when expanding the C jk s in (3), 
and so on. 

This definition is unambiguous, that is, yields the same value for D no matter which 
columns or rows we choose in expanding. A proof is given in App. 4. 
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Terms used in connection with determinants are taken from matrices. In D we have n 2 
entries a jk , also n rows and n columns, and a main diagonal on which a llf a 2 2 , • • • , a nn 
stand. Two terms are new: 

Mj k is called the minor of a jk in D, and Cjk the cofactor of Oj k in D. 

For later use we note that (3) may also be written in terms of minors 

n 

D = 2 (~iy +k a jk M jk (j = 1, 2, • • • , or n) 

k= 1 

n 

0 = 2 (-l) jf Vik (* = 1, 2, • • • , or n). 

J-l 

EXAMPLE 1 Minors and Cofactors of a Third-Order Determinant 

In ( 4 ) of the previous section the minors and cofactors of the entries in the first column can be seen directly. 
For the entries in the second row the minors are 

a \2 ^13 ffll a l3 a ll a l2 

— » A ^22 “ t M 23 — 

a Z2 a ZZ a Z\ a ZZ a Zl a Z2 

and the cofactors are C 2 1 = C 22 ~ +M 2 2 < and C23 = -M23. Similarly for the third row — write these 

down yourself. And verify that the signs in C jk form a checkerboard pattern 

+ - + 

- + - 

+ - + ■ 


(4a) 

(4b) 


EXAMPLE 2 Expansions of a Third-Order Determinant 



1 

3 

0 






6 

4 2 

4 2 6 

D = 

2 

6 

4 = 1 

- 3 

+ 0 




0 

2 -1 

2 -10 


-l 

0 

2 




= 1(12 - 0 ) - 3(4 + 4 ) + 0(0 + 6) = - 12 . 


This is the expansion by the first row. The expansion by the third column is 


2 

6 1 

3 1 

3 

D = 0 

-4 

+ 2 

= 0 - 12 4-0 = — 12 , 

-1 

0 -1 

0 2 

6 


Verify that the other four expansions also give the value —12. ■ 

EXAMPLE 3 Determinant of a Triangular Matrix 


-3 

0 

0 

4 

0 

6 

4 

0 

= -3 

= - 3 * 4*5 = - 60 . 




2 

5 

-1 

2 

5 




Inspired by this, can you formulate a little theorem on determinants of triangular matrices? Of diagonal 
matrices? ■ 
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THEOREM 1 


PROOF 


EXAMPLE 4 


General Properties of Determinants 

To obtain the value of a determinant (1), we can first simplify it systematically by 
elementary row operations, similar to those for matrices in Sec. 7.3, as follows. 


Behavior of an nth-Order Determinant under Elementary Row Operations 

(a) Interchange of two rows multiplies the value of the determinant by — 1 . 

(b) Addition of a multiple of a row to another row does not alter the value of the 
determinant. 

(c) Multiplication of a row by a nonzero constant c multiplies the value of the 
determinant by c. (This holds also when c = 0, but gives no longer an elementary 
row operation.) 


(a) By induction. The statement holds for n = 2 because 


a 

c 


= ad — be , 


but 


c 


a 


d 

b 


= be — ad. 


We now make the induction hypothesis that (a) holds for determinants of order n — 1^2 
and show that it then holds for determinants of order n. Let D be of order n. Let E be 
obtained from D by the interchange of two rows. Expand D and E by a row that is not 
one of those interchanged, call it the yth row. Then by (4a), 

n n 

( 5 ) 0 = 2 (-1 y +k a Jk M jk , £ = 2 (-1 y +k a jk N jk 

k = 1 k = 1 

where N jk is obtained from the minor M jk of a jk in D by the interchange of those two 
rows which have been interchanged in D (and which N jk must both contain because we 
expand by another row!). Now these minors are of order n — 1. Hence the induction 
hypothesis applies and gives Nj k = — M jk . Thus E = —D by ( 5 ). 

(b) Add c times Row i to Row j. Let D be the new determinant. Its entries in Row j are 
aj k + ca ik . If we expand D by this Row j, we see that we can write it as D = D x + c£> 2 , 
where D x = D has in Row j the a jk , whereas D 2 has in that Row j the a ik from the addition. 
Hence D 2 has a ik in both Row i and Row j. Interchanging these two rows gives D 2 back, 
but on the other hand it gives - D 2 by (a). Together D 2 = -Z) 2 = 0, so that D = D 1 = D. 

(c) Expand the determinant by the row that has been multiplied. 


CAUTION! det (cA) = c n det A (not c det A). Explain why. 


Evaluation of Determinants by Reduction to Triangular Form 


Because of Theorem 1 we may evaluate determinants by reduction to triangular form, as in the Gauss elimination 
for a matrix. For instance (with the blue explanations always referring to the preceding determinant ) 


D = 


2 

4 

0 

-3 


0-4 6 

5 I 0 

2 6-1 
8 9 I 
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THEOREM 2 


PROOF 


THEOREM 3 


2 0-46 

0 5 9 -12 

0 2 6 -1 

0 8 3 10 


Row 2-2 Row 1 


Row 4+1.5 Row I 



2 

0 

-4 

6 




0 

5 

9 

-12 




0 

0 

2.4 

3.8 

Row 3 

- 0.4 Row 2 


0 

0 

-11.4 

29.2 

Row 4 

- 1 .6 Row 2 


2 0-46 

0 5 9 -12 


0 

0 


0 2.4 3.8 

0 -0 47.25 


Row 4 + 4.75 Row 3 


= 2 • 5 • 2.4 • 47.25 = 1134. 


Further Properties of nth-Order Determinants 

(a)-(c) in Theorem 1 hold also for columns. 

(d) Transposition leaves the value of a determinant unaltered. 

(e) A zero row or column renders the value of a determinant zero. 

(f) Proportional rows or columns render the value of a determinant zero. In 
particular , a determinant with two identical rows or columns has the value zero. 


(a)-(e) follow directly from the fact that a determinant can be expanded by any row 
column. In (d), transposition is defined as for matrices, that is, the yth row becomes the 
jth column of the transpose. 

(f) If Row j = c times Row i, then D = cD l9 where D x has Row j = Row i. Hence an 
interchange of these rows reproduces D x , but it also gives — D x by Theorem 1(a). Hence 
D x = 0 and D = cD x = 0. Similarly for columns. ■ 

It is quite remarkable that the important concept of the rank of a matrix A, which is the 
maximum number of linearly independent row or column vectors of A (see Sec. 7.4), can 
be related to determinants. Here we may assume that rank A > 0 because the only matrices 
with rank 0 are the zero matrices (see Sec. 7.4). 


Rank in Terms of Determinants 

An m X n matrix A = [aj k ] has rank r = 1 if and only if A has an r X r submatrix 
with nonzero determinant , whereas every square submatrix with more than r rows 
that A has (or does not have!) has determinant equal to zero. 

In particular, if A is square, n X n, it has rank n if and only if 

det A^O. 
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PROOF 


THEOREM 4 


The key idea is that elementary row operations (Sec. 7.3) alter neither rank (by Theorem 
1 in Sec. 7.4) nor the property of a determinant being nonzero (by Theorem 1 in this 
section). The echelon form A of A (see Sec. 7.3) has r nonzero row vectors (which are 
the first r row vectors) if and only if rank A = r. Let R be the r X /• submatrix in the left 
upper corner of A (so that the entries of R are in both the first r rows and /* columns of A). 
Now R is triangular, with all diagonal entries tjj nonzero. Thus, det R = r n * * • r rr ¥= 0. 
Also det R ^ 0 for the corresponding r X r submatrix R of A because R results from R 
by elementary row operations. Similarly, det S = 0 for any square submatrix S of /* -1- 1 
or more rows perhaps contained in A because the corresponding submatrix § of A must 
contain a row of zeros (otherwise we would have rank A ^ r + 1), so that det § = 0 by 
Theorem 2. This proves the theorem for an m X n matrix. 

In particular, if A is square, n X /?, then rank A = n if and only if A contains an n X n 
submatrix with nonzero determinant. But the only such submatrix can be A itself, hence 
det A 0. ■ 

Cramer’s Rule 

Theorem 3 opens the way to the classical solution formula for linear systems known as 
Cramer’s rule 2 , which gives solutions as quotients of determinants. Cramer’s rule is not 
practical in computations (for which the methods in Secs. 7.3 and 20.1-20.3 are suitable), 
but is of theoretical interest in differential equations (Secs. 2.10, 3.3) and other theories 
that have engineering applications. 


Cramer’s Theorem (Solution of Linear Systems by Determinants) 

(a) If a linear system of n equations in the same number of unknowns x 1? • • • ,A* n 

rtnAi + a 12 x 2 H + o x n x n = /?! 

a 2 \X\ + a 22 x 2 + • • • + a 2n x n = b 2 

( 6 ) 


a nix 1 + a n2 x 2 + 1- a nn x n - b n 

has a nonzero coefficient determinant D = det A, the system has precisely one 
solution. This solution is given by the formulas 

D 2 

(7) A'i = — , a* 2 =—,••• , x n = — (Cramer’s rule) 

where D k is the determinant obtained from D by replacing in D the kth column by 
the column with the entries * • • ,b n . 

(b) Hence if the system (6) is homogeneous and D =£ 0, it has only the trivial 
solution x x = 0, x 2 = 0, ■ • • , x n = 0. If D = 0, the homogeneous system also has 
nontrivial solutions. 


2 GABRIEL CRAMER (1704-1752), Swiss mathematician. 
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PROOF The augmented matrix A of the system (6) is of size n X (n 4- 1). Hence its rank can be 
at most n. Now if 


flu ’ ’ " a ln 


( 8 ) 


D = det A = 


# 0 , 


* ’ ' &nn 


then rank A = n by Theorem 3. Thus rank A = rank A. Hence, by the Fundamental 
Theorem in Sec. 7.5, the system (6) has a unique solution. 

Let us now prove (7). Expanding D by its kth column, we obtain 


(9) 


^ G-lk-Cik. 4" ^2k^2k 4^ * * * 4~ Cl nk C nk , 


where C ik is the cofactor of entry a ik in D. If we replace the entries in the kth column of 
D by any other numbers, we obtain a new determinant, say, D. Clearly, its expansion by 
the &th column will be of the form (9), with a lk , , • • • , a nk replaced by those new numbers 
and the cofactors C ik as before. In particular, if we choose as new numbers the entries 
a ib * • * , a n i of the /th column of D (where l + £), we have a new determinant D which 
has twice the column [a n • • • a ni ] T , once as its /th column, and once as its fcth 
because of the replacement. Hence D = 0 by Theorem 2(f). If we now expand D by the 
column that has been replaced (the lah column), we thus obtain 

(10) c i\ t C lk 4- a 2 iC 2k + ■ - ■ + OniC nk = 0 (/ =£ k). 

We now multiply the first equation in (6) by C lk on both sides, the second by C 2k , • ■ • , 
the last by C nfc , and add the resulting equations. This gives 


(ID 


QiJc(^ii-*i 4- • • • -f cii n x n ) 4- • ■ ■ 4- C nk (ct n 4” • • • 4- (i nn A' n ) 
= b x C lk 4- • * • 4- b n C nk . 


Collecting terms with the same x j9 we can write the left side as 


Xi(&nCi k 4* a 2i^2k 4" ’ * * 4- d n \C nk ) 4- • • • 4- x n (o.i n Ci k 4- a 2n C 2k 4- • • • 4- a nn C nk ). 


From this we see that x k is multiplied by 


a i k^ik 4* a 2 k C 2k 4 4- a nk C nk . 


Equation (9) shows that this equals D. Similarly, jt t is multiplied by 


a i i^ik 4* a 2 t C 2k 4- • • • 4* ci n iC nk . 


Equation (10) shows that this is zero when / # k. Accordingly, the left side of (11) equals 
simply A'fcD, so that (11) becomes 

*kP ~ b-±C\ k 4- b 2 C 2k 4" ■ ’ * 4- b n C nk . 
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Now the right side of this is D k as defined in the theorem, expanded by its kx h column, 
so that division by D gives (7). This proves Cramer’s rule. 

If (6) is homogeneous and D =£ 0, then each D k has a column of zeros, so that D k = 0 
by Theorem 2(e), and (7) gives the trivial solution. 

Finally, if (6) is homogeneous and D = 0, then rank A < n by Theorem 3, so that 
nontrivial solutions exist by Theorem 2 in Sec. 7.5. ■ 


Illustrations of Theorem 4 for n = 2 and 3 are given in Sec. 7.6, and an important 
application of the present formulas will follow in the next section. 


PROBLEM SET 7.7 


1. (Second-order determinant) Expand a general second- 
order determinant in four possible ways and show that 
the results agree. 

2. (Minors, cofactors) Complete the list of minors and 
cofactors in Example I . 

3. (Third-order determinant) Do the task indicated in 
Example 2. Also evaluate D by reduction to triangular 
form. 

4. (Scalar multiplication) Show that det (fcA) = k n det A 
(not k det A), where A is any n X n matrix. Give an 
example. 


5-16 1 EVALUATION OF DETERMINANTS 

Evaluate, showing the details of your work. 


13 

8 

cos n 0 

sin n 0 

5. 


6. 


-2 

7 

—sin nO 

cos n 0 




14 

2 

5 

cos a 

sin a. 




7. 

8. 

2 

0 

8 

sin p 

cos p 






5 

8 

-2 

70.4 

o 

u> 

o 

bo 

2 

1 

2 

9. 0 

0.5 2.6 10. 

-2 

2 

1 

0 

0 -1.9 

1 

2 

-2 

0 

3 -11 

0 

a 

b 

11. -3 

0 

1 

1° 

—a 

0 

c 

1 

4 0 

-b 

—c 

0 


1-200 

IV 

4 3 5 0 

v 14. 

0 2 7 5 

u 

0 0 2 4 



1 

2 

0 

0 


0 

-2 

1 

0 


3 

4 

0 

0 


2 

0 

-2 

4 

15. 

0 

0 

5 

6 

16. 

-1 

2 

0 

1 


0 

0 

7 

8 


0 

-4 

-1 

0 


17. (Expansion numerically impractical) Show that the 
computation of an nth-order determinant by expansion 
involves n\ multiplications, which if a multiplication 
takes 10“ 9 sec would take these times: 


n 

10 

15 

20 

25 

Time 

0.004 

22 

77 

0.5 • 10 9 

sec 

min 

years 

years 


1 18-20 1 CRAMER’S RULE 

Solve by Cramer’s rule and check by Gauss elimination and 
back substitution. (Show details.) 


18. 2x - 5 y = 23 
4jc + 6y — —2 

19. 3y + 4z = 14.8 
4a* + 2 y — z — —6.3 

a* - y + 5z = 13.5 

20. iv + 2a* - 3z = 30 

4a* - 5y + 2z = 13 
2iv -f 8 a — 4y 4- z = 42 
3iv + y - 5z = 35 


1 21-23) RANK BY DETERMINANTS 

Find the rank by Theorem 3 (which is not a very practical 
way) and check by row reduction. (Show details.) 


u v 

13. w u 

V w* 
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r s 4i 


21 . 


-2 


L 6 


-1 

3. 


(a) Line through two points. Derive from D = 0 in 
(12) the familiar formula 

■v ~ Ai _ y - y t 
“ *2 yi-te' 



‘ 2 

1 

0" 



22. 

13 

-13 

12 




.-3 

5 

-4. 




"0.4 

0 

-2.4 

3.0" 

23. 

1.2 

0.6 

0 


0.3 


_ 0 

1.2 

1.2 

0 . 


24. TEAM PROJECT. Geometrical Applications: 
Curves and Surfaces Through Given Points. The 
idea is to get an equation from the vanishing of 
the determinant of a homogeneous linear system as the 
condition for a nontrivial solution in Cramer’s theorem. 
We explain the trick for obtaining such a system for 
the case of a line L through two given points P x : (jq, y x ) 
and P 2 : (a' 2 , y 2 )- The unknown line is ax -I- by = — c, 
say. We write it as ax + by + c • 1 = 0. To get a 
nontrivial solution a , b , c, the determinant of the 
“coefficients” x, y, 1 must be zero. The system is 

ax + by + c * 1 = 0 (Line L) 

(12) ax i + by i -1- c • 1 =0 (P x on L) 

ax 2 + by 2 + c • 1 =0 (P 2 on L). 


(b) Plane. Find the analog of (12) for a plane through 
three given points. Apply it when the points are (1,1,1), 
(3, 2, 6), (5, 0, 5). 

(c) Circle. Find a similar formula for a circle in the 
plane through three given points. Find and sketch the 
circle through (2, 6), (6, 4), (7. 1 ). 

(d) Sphere. Find the analog of the formula in (c) for 
a sphere through four given points. Find the sphere 
through (0, 0, 5), (4, 0, 1), (0, 4, 1), (0, 0, -3) by this 
formula or by inspection. 

(e) General conic section. Find a formula for a 
general conic section (the vanishing of a determinant 
of 6th order). Try it out for a quadratic parabola and 
for a more general conic section of your own choice. 

25. WRITING PROJECT. General Properties of 
Determinants. Illustrate each statement in Theorems 
1 and 2 with an example of your choice. 

26. CAS EXPERIMENT. Determinant of Zeros and 
Ones. Find the value of the determinant of the n X n 
matrix A n with main diagonal entries all 0 and all others 
I. Try to find a formula for this. Try to prove it by 
induction. Interpret A 3 and A 4 as “incidence matrices ” 
(as in Problem Set 7. 1 but without the minuses) of a 
triangle and a tetrahedron, respectively; similarly for 
an “n-simplex”, having n vertices and n(n — l)/2 edges 
(and spanning R n ~ l , n = 5, 6, * * *)• 


7.8 Inverse of a Matrix. 

Gauss-Jordan Elimination 

In this section we consider square matrices exclusively . 

The inverse of an n X n matrix A = [aj k ] is denoted by A"” 1 and is an n X n matrix 
such that 

(1) AA" 1 = A~ X A = I 

where I is the n X n unit matrix (see Sec. 7.2). 

If A has an inverse, then A is called a nonsingular matrix. If A has no inverse, then 
A is called a singular matrix. 

If A has an inverse , the inverse is unique . 

Indeed, if both B and C are inverses of A, then AB = I and CA = I, so that we obtain 
the uniqueness from 


B = IB = (CA)B = C(AB) = Cl = C. 
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PROOF 


We prove next that A has an inverse (is nonsingular) if and only if it has maximum 
possible rank n. The proof will also show that Ax = b implies x = A" 1 b provided A” 1 
exists, and will thus give a motivation for the inverse as well as a relation to linear systems. 
(But this will not give a good method of solving Ax = b numerically because the Gauss 
elimination in Sec. 7.3 requires fewer computations.) 


Existence of the Inverse 

The inverse A” 1 of an n X n matrix A exists if and only if rank A = n, thus (by 
Theorem 3, Sec. 7.7) if and only if det A =£ 0. Hence A is nonsingular if rank A = n, 
and is singular /frank A < n. 


Let A be a given n X n matrix and consider the linear system 

(2) Ax = b. 

If the inverse A” 1 exists, then multiplication from the left on both sides and use of (1) 
gives 

A -1 Ax = x = A _1 b. 

This shows that (2) has a unique solution x. Hence A must have rank n by the Fundamental 
Theorem in Sec. 7.5. 

Conversely, let rank A = n. Then by the same theorem, the system (2) has a unique 
solution x for any b. Now the back substitution following the Gauss elimination (Sec. 7.3) 
shows that the components xj of x are linear combinations of those of b. Hence we can 
write 

(3) x = Bb 
with B to be determined. Substitution into (2) gives 


Ax = A(Bb) = (AB)b = Cb = b (C = AB) 

for any b. Hence C = AB = I, the unit matrix. Similarly, if we substitute (2) into (3) we 
get 

x = Bb = B(Ax) = (BA)x 

for any x (and b = Ax). Hence BA = I. Together, B = A^ 1 exists. ■ 


3 WILHELM JORDAN (1842-1899), German mathematician and geodesist. [See American Mathematical 
Monthly 94 ( 1 987), 1 30-1 42.] 

We do not recommend it as a method for solving systems of linear equations, since the number of operations 
in addition to those of the Gauss elimination is larger than that for back substitution, which the Gauss-Jordan 
elimination avoids. See also Sec. 20.1. 
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Determination of the Inverse 
by the Gauss-Jordan Method 

For the practical determination of the inverse A” 1 of a nonsingular n X n matrix A we 
can use the Gauss elimination (Sec. 7.3), actually a variant of it, called the Gauss-Jordan 
elimination 3 (footnote of p. 316). The idea of the method is as follows. 

Using A, we form n linear systems 

Ax (1 ) = e (1) , * * * , Ax (n) = e (w) 

where e cl) , • • • , e (n) are the columns of the n X n unit matrix I; thus, 
e d> — D 0 • • • Of, e (2 > = [0 1 0 • • • Of, etc. These are n vector equations 

in the unknown vectors x a) , ♦ • • , x (n) . We combine them into a single matrix equation 
AX = I, with the unknown matrix X having the columns x (1) , • • • , x (n) . 
Correspondingly, we combine the n augmented matrices [A e (1) ], • • • , [A e (n> ] into 
one n X In “augmented matrix” A = [A I]. Now multiplication of AX = I by A -1 
from the left gives X = A -1 I = A” 1 . Hence, to solve AX = I for X, we can apply the 
Gauss elimination to A = [A I]. This gives a matrix of the form [U H] with upper 
triangular U because the Gauss elimination triangularizes systems. The Gauss-Jordan 
method reduces U by further elementary row operations to diagonal form, in fact to the 
unit matrix I- This is done by eliminating the entries of U above the main diagonal and 
making the diagonal entries all 1 by multiplication (see the example below). Of course, 
the method operates on the entire matrix [U H], transforming H into some matrix K, 
hence the entire [U H] to [I K]. This is the “augmented matrix” of IX = K. Now 
IX = X = A” 1 , as shown before. By comparison, K = A“\ so that we can read A -1 
directly from [I K]. 

The following example illustrates the practical details of the method. 

EXAMPLE 1 Inverse of a Matrix. Gauss-Jordan Elimination 

Determine the inverse A -1 of 

'-I I 
A = 3-1 

_-l 3 

Solution . We apply the Gauss elimination (Sec. 7.3) to the following n X 2n = 3 X 6 matrix, where BLUE 
always refers to the previous matrix. 

"-1 12 100 

[A IJ = 3—1 I 010 

.-1 3 4 0 0 1 

2 1 0 O’ 

7 3 10 Row 2-1-3 Row 1 

2 - 1 0 l . Row 3 — Row 1 

2 I 0 01 

7 3 10 




0 0-5 


-4 -1 l 


Row 3 - Row 2 
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This is [U H] as produced by the Gauss elimination. Now follow the additional Gauss-Jordan steps, reducing 
U to 1, that is, to diagonal fonn with entries 1 on the main diagonal. 


‘1 

-1 

-2 

-1 

0 

0’ 

— Row 1 

0 

1 

3.5 

1.5 

0.5 

0 

0.5 Row 2 

.0 

0 

1 

0.8 

0.2 

-0.2. 

-0.2 Row 3 

“1 

-1 

0 

0.6 

0.4 

-0.4“ 

Row 1 + 2 Row 3 

0 

1 

0 

-1.3 

-0.2 

0.7 

Row 2 - 3.5 Row 3 

.0 

0 

l 

0.8 

0.2 

-0.2. 


"1 

0 

0 

-0.7 

0.2 

0.3" 

Row 1 + Row 2 

0 

1 

0 

-1.3 

-0.2 

0.7 


.0 

0 

1 

0.8 

0.2 

-0.2. 



The last three columns constitute A x . Check: 


"-1 1 2" 


1 

p 

0 

to 

0 

UJ> 


0 

0 

3 -1 1 


-1.3 -0.2 0.7 

= 

0 1 0 

_-l 3 4. 


. 0.8 0.2 -0.2. 


.0 0 L 


Hence AA -1 = I. Similarly, A -1 A = I. ■ 

Useful Formulas for Inverses 

The explicit formula (4) in the following theorem is often useful in theoretical studies (as 
opposed to computing inverses). In fact, the special case n = 2 occurs quite frequently in 
geometrical and other applications. 


Inverse of a Matrix 

The inverse of a nonsingular n X n matrix A = [a jk ] is given by 


( 4 ) A*” 1 = 


1 t 1 

— [C jk ) T = — 


det A 


det A 


Qi 

C12 


L C ln 


C21 

C22 

^2 n 


Cnl 
Cn 2 

CnnJ 


where Cj k is the cofactor of a j k in det A (see Sec. 7.7). (CAUTION! Note well that 
in A -1 , the cofactor Cj k occupies the same place as a ^ (not aj k ) does in A.) 

In particular , the inverse of 


A — 

'0n 

a 12 

a -1 - * 

a 22 

— 012 _ 

A — 

_ fl 21 

& 22 _ 

IS A — 

det A 

°21 

*11. 
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PROOF 


EXAMPLE 2 


EXAMPLE 3 


We denote the right side of (4) by B and show that BA = I. We first write 
(5) BA = G = [g kl ] 

and then show that G = I. Now by the definition of matrix multiplication and because of 
the form of B in (4), we obtain (CAUTION! C sk , not C^) 

n r . 

« . ^ &$i i . a. (a n C lk 4“ • • • + o n iC nk ). 

S=1 

Now (9) and (10) in Sec. 7.7 show that the sum (• • •) on the right is D = det A when 
/ = k, and is zero when / ¥= k. Hence 


1 


Skk ~ 


det A = 1, 


det A 

gkl = 0 Q* k). 

In particular, for n = 2 we have in (4) in the first row C u = a 2 2 , C 21 = —a 12 and in 
the second row C 12 = —a 2 19 C 22 = a lv This gives (4*)- ■ 

Inverse of a 2 x 2 Matrix 

*-r 'l. A-..±r 4 -'i.r°- 4 -° j i 

L2 4J 10 L-2 3j L— 0.2 0.3 J 

Further Illustration of Theorem 2 

Using (4), find the inverse of 


A = 


-1 1 
3 -J 
-1 3 


Solution . We obtain det A = - 1(-7) - l • 13 + 2 • 8 = 10, and in (4), 



-1 1 

11 2 


1 2 

= 


II 

1 

-j 

II 

1 

= 2 . c 31 = 



3 4 

13 4 


-1 1 


= 3. 


Cl2 “ 


3 1 

-1 4 


— — 13, C 2 2 — 


= - 2 , 


-1 2 

3 I 


= 7, 


3 -1 

-1 I 


-1 1 


= 8, C23 = — 

— 2, C33 — 


-1 3 

-1 3 


3 -1 


c l 3 ~ 

so that by (4), in agreement with Example 1, 

A" x = 


- " 2 , 


-0.7 0.2 0.3' 

—1.3 -0.2 0.7 


L 0.8 0.2 - 0 . 2 J 
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EXAMPLE 4 


PROOF 


Diagonal matrices A = [a j7c ], a jk = 0 when j =£ k, have an inverse if and only if all 
ajj =£ 0. Then A” 1 is diagonal, too, with entries l/a n , • • • , 1 /a nn . 


For a diagonal matrix we have in (4) 


Cll = ^22 ^7in _ j 

£> ^11^22 * • * &nn a ll 


etc. 


Inverse of a Diagonal Matrix 

Let 


Then the inverse is 



Products can be inverted by talcing the inverse of each factor and multiplying these 
inverses in reverse order, 

(7) (AC)" 1 = C -1 A -1 . 


Hence for more than two factors, 

(8) (AC • • • PQ)" 1 = Q -1 P -1 • • • C^A" 1 . 


The idea is to start from ( I ) for AC instead of A, that is, AC(AC) 1 = I, and multiply 
it on both sides from the left, first by A -1 , which because of A -1 A = I gives 

A -1 AC(AC) _1 = C(AC)" 1 
= A -1 I = A -1 , 

and then multiplying this on both sides from the left, this time by C -1 and by using 
C _1 C = I, 

C-kXAC)- 1 = (AC)- 1 = C _1 A _1 . 

This proves (7), and from it, (8) follows by induction. ■ 

We also note that the inverse of the inverse is the given matrix, as you may prove, 

(9) (A" 1 )’ 1 = A. 
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THEOREM 3 


PROOF 


Unusual Properties of Matrix Multiplication. 
Cancellation Laws 

Section 7.2 contains warnings that some properties of matrix multiplication deviate from 
those for numbers, and we are now able to explain the restricted validity of the so-called 
cancellation laws [2.] and [3.] below, using rank and inverse, concepts that were not yet 
available in Sec. 7.2. The deviations from the usual are of great practical importance and 
must be carefully observed. They are as follows. 

[1.] Matrix multiplication is not commutative, that is, in general we have 

AB * BA. 

[2.] AB = 0 does not generally imply A = 0 or B = 0 (or BA = 0); for example, 

c ari-K 3- 

[3.] AC = AD does not generally imply C = D (even when A ^ 0). 

Complete answers to [2.] and [3.] are contained in the following theorem. 


Cancellation Laws 

Let A, B, C be n X n matrices . Then: 

(a) /frank A = n and AB = AC, then B = C. 

(b) /frank A = n, then AB = 0 implies B = 0. Hence if AB = 0, but A # 0 
as well as B # 0, then rank A < n and rank B < n. 

(c) If A is singular, so are BA and AB. 


(a) The inverse of A exists by Theorem 1. Multiplication by A 1 from the left gives 
A -1 AB = A- X AC, hence B = C. 

(b) Let rank A = n. Then A -1 exists, and AB = 0 implies A _1 AB = B = 0. Similarly 
when rank B = n. This implies the second statement in (b). 

(c x ) Rank A < n by Theorem 1. Hence Ax = 0 has nontrivial solutions by Theorem 2 
in Sec. 7.5. Multiplication by B shows that these solutions are also solutions of BAx = 0, 
so that rank (BA) < n by Theorem 2 in Sec. 7.5 and BA is singular by Theorem 1. 

(c 2 ) A T is singular by Theorem 2(d) in Sec. 7.7. Hence B T A T is singular by part (c x ), 
and is equal to (AB) T by (lOd) in Sec. 7.2. Hence AB is singular by Theorem 2(d) in 
Sec. 7.7. ■ 

Determinants of Matrix Products 

The determinant of a matrix product AB or BA can be written as the product of the 
determinants of the factors, and it is interesting that del AB = det BA, although AB BA 
in general. The corresponding formula (10) is needed occasionally and can be obtained 
by Gauss-Jordan elimination (see Example 1) and from the theorem just proved. 
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Determinant of a Product of Matrices 

For any n X n matrices A and B, 

(10) det (AB) = del (BA) = det A det B. 


PROOF IfAorBis singular, so are AB and BA by Theorem 3(c), and (10) reduces to 0 = 0 by 
Theorem 3 in Sec. 7.7. 

Now let A and B be nonsingular. Then we can reduce A to a diagonal matrix A = [a jk ] 
by Gauss-Jordan steps. Under these operations, det A retains its value, by Theorem l in 
Sec. 7.7, (a) and (b) [not (c)] except perhaps for a sign reversal in row interchanging when 
pivoting. But the same operations reduce AB to AB with the same effect on det (AB). 
Hence it remains to prove (10) for AB; written out, 



"flu 

0 

0 “ 


~bn 

bi 2 

bln 

Ab = 

0 

<?22 

0 


^21 

b 2 2 



_ 0 

0 



1-4 

1 

bn 2 

bnn _ 



r 

^11^12 


Qllbln 1 



^ 22^21 # 22^22 * * * &22p2n 




a nn^n2 


^nn^nn J 


We now take the determinant det (AB). On the right we can take out a factor a lx from 
the first row, a 2 2 from the second, • • • , a nn from the nth. But this product <5 n a 22 m 9 * S nn 
equals det A because A is diagonal. The remaining determinant is det B. This proves (10) 
for det (AB), and the proof for det (BA) follows by the same idea. ■ 

This completes our discussion of linear systems (Secs. 7.3-7.8). Section 7.9 on vector 
spaces and linear transformations is optional. Numeric methods are discussed in Secs. 
20.1-20.4, which are independent of other sections on numerics. 


FFB 


1-12 


INVERSE 


Find the inverse by Gauss-Jordan [or by (4*) if n = 2] or 
state that it does not exist. Check by using (1). 


fl.20 

4.641 

ro.6 

0.8“ 

1. 


2. 


L0.50 

3.60J 

Lo.8 

-0.6. 


3. 


cos 20 
-sin 20 


sin 2 O' 
cos 20. 


2 

3 


-2 


3 


1 
3 

2 

3 


2 

3 


-ij 


4. 
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3 

-1 

r 


" 29 

-11 

10 ' 

5. 

— 15 

6 

-5 

6. 

-160 

61 

-55 


5 

-2 

2 _ 


55 

-21 

19 . 


7. 


I 

9 


0 

1 


0 “ 

0 


L5 4 lj 


n 2 51 


8. 


0 

9 


-1 

4 


2 

11 


9. 


"0 

1 

.0 


1 

0 

0 


0 

0 

1. 


10 . 


“0 

0 

2 


8 

0 

0 


O' 

4 

0. 


11 . 


I 

0 

9 


2 

-1 

4 


5' 

2 

10 . 



13. (Triangular matrix) Is the inverse of a triangular 
matrix always triangular (as in Prob. 7)? Give reason. 

14. (Rotation) Give an application of the matrix in Prob. 
3 that makes the form of its inverse obvious. 

15. (Inverse of the square) Verify (A 2 )” 1 = (A” 1 ) 2 for 
A in Prob. 5. 

16. Prove the formula in Prob. 15. 

17. (Inverse of the transpose) Verify (A T ) 1 = (A” 1 ) 1 ^ 
for A in Prob. 5. 

18. Prove the formula in Prob. 17. 

19. (Inverse of the inverse) Prove that (A -1 )” 1 = A. 

20. (Row interchange) Same question as in Prob. 14 for 
the matrix in Prob. 9. 


21-23 


EXPLICIT FORMULA (4) FOR THE 
INVERSE 


Formula (4) is generally not very practical. To understand 
its use. apply it: 

21. To Prob. 9. 22. To Prob. 4. 23. To Prob. 7. 


7.5 Vector Spaces, Inner Product Spaces, 
Linear Transformations Optional 


In Sec. 7.4 we have seen that special vector spaces arise quite naturally in connection 
with matrices and linear systems, that their elements, called vectors , satisfy rules quite 
similar to those for numbers [(3) and (4) in Sec. 7.1], and that they are often obtained as 
spans (sets of linear combinations) of finitely many given vectors. Each such vector has 
n real numbers as its components. Look this up before going on. 

Now if we take all vectors with n real numbers as components (“real vectors ”), we 
obtain the very important real u-dimensional vector space R n . This is a standard name 
and notation. Thus, each vector in R n is an ordered /i-tuple of real numbers. 

Particular cases are # 2 , the space of all ordered pairs (“vectors in the plane”) and R 3 , 
the space of all ordered triples (“vectors in 3-space”). These vectors have wide applications 
in mechanics, geometry, and calculus that are basic to the engineer and physicist. 

Similarly, if we take all ordered n-tuples of complex numbers as vectors and complex 
numbers as scalars, we obtain the complex vector space C r \ which we shall consider in 
Sec. 8.5. 

This is not all. There are other sets of practical interest (sets of matrices, functions, 
transformations, etc.) for which addition and scalar multiplication can be defined in a 
natural way so that they form a “vector space”. This suggests to create from the “concrete 
model” R n the “abstract concept” of a “real vector space” V by taking the basic properties 
(3) and (4) in Sec. 7.1 as axioms. These axioms guarantee that one obtains a useful and 
applicable theory of those more general situations. Note that each axiom expresses a simple 
property of R n or, as a matter of fact, of R 3 . Selecting good axioms needs experience and 
is a process of trial and error that often extends over a long period of time. 
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DEFINITION 


Real Vector Space 

A nonempty set V of elements a, b, • • • is called a real vector space (or real linear 
space), and these elements are called vectors (regardless of their nature, which will 
come out from the context or will be left arbitrary) if in V there are defined two 
algebraic operations (called vector addition and scalar multiplication) as follows. 

I. Vector addition associates with every pair of vectors a and b of V a unique 
vector of V , called the sum of a and b and denoted by a + b, such that the following 
axioms are satisfied. 

1.1 Commutativity . For any two vectors a and b of V, 

a + b = b 4- a. 


1.2 Associativity . For any three vectors u, v, w of V , 

(u + v) + w = u + (v + w) (written u + v + w). 

1.3 There is a unique vector in V, called the zero vector and denoted by 0, such 
that for every a in V % 

a + 0 = a. 

1.4 For every a in V there is a unique vector in V that is denoted by —a and is 
such that 


a + (-a) = 0. 

n. Scalar multiplication. The real numbers are called scalars. Scalar 
multiplication associates with every a in V and every scalar c a unique vector of V, 
called the product of c and a and denoted by ca (or ac) such that the following 
axioms are satisfied. 

II.l Distributivity. For every scalar c and vectors a and b in V , 

c(a + b) = ca 4- cb. 

H.2 Distributivity . For all scalars c and k and every a in V , 

(c + k) a = ca 4- £a. 

n.3 Associativity. For all scalars c and k and every a in V , 

c(ka) = (ck) a (written cksi). 

H.4 For every a in V, 

la = a. 


A complex vector space is obtained if, instead of real numbers, we take complex numbers 
as scalars. 
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EXAMPLE I 


EXAMPLE 2 


Basic concepts related to the concept of a vector space are defined as in Sec. 7.4. 

A linear combination of vectors a (1) , , a (m) in a vector space V is an 

expression 

c x a (1) + * * * + a (m) (c lf • • • , c m any scalars). 

These vectors form a linearly independent set (briefly, they are called linearly 
independent) if 

(1) c x a ( i) + • • • 4- c m a (m) = 0 

implies that c x = 0, • • • , c m = 0. Otherwise, if (1) also holds with scalars not all zero, 
the vectors are called linearly dependent. 

Note that (1) with m = 1 is ca = 0 and shows that a single vector a is linearly 
independent if and only if a # 0. 

V has dimension n, or is ^-dimensional, if it contains a linearly independent set of n 
vectors, whereas any set of more than n vectors in V is linearly dependent. That set of n 
linearly independent vectors is called a basis for V. Then every vector in V can be written 
as a linear combination of the basis vectors; for a given basis, this representation is unique 
(see Prob. 14). 

Vector Space of Matrices 

The real 2X2 matrices form a four-dimensional real vector space. A basis is 



because any 2X2 matrix A = [«j fc ] has a unique representation A = a n B n + ^ 12^12 + a zi^2i + fl 22®22- 
Similarly, the real m X n matrices with fixed m and n form an m/i-dimensional vector space. What is the 
dimension of the vector space of all 3 X 3 skew-symmetric matrices? Can you find a basis? M 

Vector Space of Polynomials 

The set of all constant, linear, and quadratic polynomials in a* together is a vector space of dimension 3 with 
basis { 1. a, a 2 } under the usual addition and multiplication by real numbers because these two operations give 
polynomials not exceeding degree 2. What is the dimension of the vector space of all polynomials of degree 
not exceeding a given fixed /i? Can you find a basis? M 

If a vector space V contains a linearly independent set of n vectors for every n , no matter 
how large, then V is called infinite dimensional, as opposed to a finite dimensional 
(/7-dimensional) vector space just defined. An example of an infinite dimensional vector 
space is the space of all continuous functions on some interval [a, b ] of the A*-axis, as we 
mention without proof. 


Inner Product Spaces 


If a and b are vectors in R n t regarded as column vectors, we can form the product a T b. 
This is a 1 X 1 matrix, which we can identify with its single entry, that is, with a number. 
This product is called the inner product or dot product of a and b. Other notations for 
it are (a, b) and a*b. Thus 


a T b = (a, b) = a*b = [a x • • • a n ] 


= 2 a ih = a-ibx + 
1=1 


c n b n . 
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We now extend this concept to general real vector spaces by taking basic properties of 
(a, b) as axioms for an “abstract inner product” (a, b) as follows. 


DEFINITION 


Real Inner Product Space 

A real vector space V is called a real inner product space (or real pre-Hilbert 4 
space) if it has the following property. With every pair of vectors a and b in V there 
is associated a real number, which is denoted by (a, b) and is called the inner 
product of a and b, such that the following axioms are satisfied. 

I. For all scalars q x and q 2 and all vectors a, b, c in V \ 

( 4 ia 4- q 2 b, c) = q ± ( a, c) 4- q 2 (b , c) ( Linearity ). 

II. For all vectors a and b in V, 


III. For every a in V y 


(a, b) = (b, a) 


{Symmetry). 


(a, a) ^ 0, 

(a, a) = 0 if and only if a = 0 


( Positive-definiteness ). 


Vectors whose inner product is zero are called orthogonal. 

The length or norm of a vector in V is defined by 

(2) || a || =V(aTa) (^0). 

A vector of norm 1 is called a unit vector. 

From these axioms and from (2) one can derive the basic inequality 

(3) |(a, b)| ^ || a || || b || (Cauchy-Schwan 5 inequality). 

From this follows 

(4) || a + b||^||a|| 4- ||b|| ( Triangle inequality ). 

A simple direct calculation gives 

(5) || a 4- b || 2 4- || a — b|| 2 = 2( ||a|| 2 + ||b|| 2 ) ( Parallelogram equality). 


frjAVID HILBERT (1862-1943), great German mathematician, taught at Konigsberg and Gottingen and was 
the creator of the famous Gottingen mathematical school. He is known for his basic work in algebra, the calculus 
of variations, integral equations, functional analysis, and mathematical logic. His “Foundations of Geometry" 
helped the axiomatic method to gain general recognition. His famous 23 problems (presented in 1900 at the 
International Congress of Mathematicians in Paris) considerably influenced the development of modern 
mathematics. 

Tf V is finite dimensional, it is actually a so-called Hilbert space; see Ref. [GR7], p. 73, listed in App. I. 

5 HERMANN AMANDUS SCHWARZ (1843-1921). German mathematician, known by his work in complex 
analysis (conformal mapping) and differential geometry. For Cauchy see Sec. 2.5. 
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EXAMPLE 3 


EXAMPLE 4 


n-Dimensional Euclidean Space 

R n with the inner product 

(6) (a, b) = a T b = a\bi + • • • + o n b n 

(where both a and b are column vectors) is called the n -dimensional Euclidean space and is denoted by E n 
or again simply by R n . Axioms T-TTI hold, as direct calculation shows. Equation (2) gives the “Euclidean norm” 

(7) HI = V(a, a) = Va'a = V«j 2 + • • • + a n 2 . ■ 


An Inner Product for Functions. Function Space 

The set of all real-valued continuous functions /( jc), g(*), • • • on a given interval a ^ x ^ is a real vector 
space under the usual addition of functions and multiplication by scalars (real numbers). On this “function 
space” we can define an inner product by the integral 

(8) (/. g) = / f(.x)g(x)dx. 


Axioms I— III can be verified by direct calculation. Equation (2) gives the norm 


(9) 


ll/ll =V(fJ) = 



}(xf dx. 


■ 


Our examples give a first impression of the great generality of the abstract concepts of 
vector spaces and inner product spaces. Further details belong to more advanced courses 
(on functional analysis, meaning abstract modern analysis; see Ref. [GR7] listed in App. 1) 
and cannot be discussed here. Instead we now take up a related topic where matrices play 
a central role. 


Linear Transformations 

Let X and Y be any vector spaces. To each vector x in X we assign a unique vector y in 
Y. Then we say that a mapping (or transformation or operator) of X into Y is given. 
Such a mapping is denoted by a capital letter, say F. The vector y in Y assigned to a vector 
x in X is called the image of x under F and is denoted by F(x) [or Fx, without parentheses]. 

F is called a linear mapping or linear transformation if for all vectors v and x in X 
and scalars c. 


( 10 ) 


F(y + x) = F(v) + F(x) 
F(cx) = cF(x). 


Linear Transformation of Space R n into Space R m 

From now on we let X = R n and Y = R m . Then any real m X n matrix A = [Oj fc ] gives 
a transformation of R n into R m , 

(11) y = Ax. 


Since A(u + x) = Au + Ax and A(cx) = cAx, this transformation is linear. 

We show that, conversely, every linear transformation F of R n into R m can be given 
in terms of an m X n matrix A, after a basis for R n and a basis for R m have been chosen. 
This can be proved as follows. 
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Let e^, • • • , e Cn) be any basis for R n . Then every x in R n has a unique representation 

X — + • * * H" A' n C( n ). 

Since F is linear, this representation implies for the image F(x): 

F(x) = Fixjeay + • • • + x„e (n) ) = *iF(e a >) + • • • + x n F(^ n) ). 

Hence F is uniquely determined by the images of the vectors of a basis for R n . We now 
choose for R n the “standard basis” 



T 


"0“ 


"o' 


0 


1 


0 

II 

cT 

N 

rH 

s— ✓ 

0 

» e (2) - 

0 

> ‘ ‘ * i ®(n) 

0 


.0. 


.0. 


.1. 


where e (J) has its j th component equal to 1 and all others 0. We show that we can now 
determine an m X n matrix A = [a jk ] such that for eveiy x in R n and image y = F(x) in 

y = F(x) = Ax. 

Indeed, from the image y (1) = F(e (1) ) of e (1) we get the condition 




an - 

' a l n 


V 

yft 3 

= 

a 21 

a 2 n 


0 



* ' 

a^mn _ 


_o_ 


from which we can determine the first column of A, namely a xl = y i 1> , a 21 — y£ l> » ‘ * * > 
a mi = yin*- Similarly, from the image of e (2 > we get the second column of A, and so on. 
This completes the proof. ■ 

We say that A represents F, or is a representation of F, with respect to the bases for R n 
and R m . Quite generally, the purpose of a “representation” is the replacement of one 
object of study by another object whose properties are more readily apparent. 

In three-dimensional Euclidean space F 3 the standard basis is usually written e (1) = i, 
e (2 ) = j> e (3) = k- Thus, 


T 


‘0‘ 


'0‘ 

0 

. j = 

1 

k = 

0 

.0. 


.0. 


.1. 
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These are the three unit vectors in the positive directions of the axes of the Cartesian 
coordinate system in space, that is, the usual coordinate system with the same scale of 
measurement on the three mutually perpendicular coordinate axes. 


EXAMPLE 5 Linear Transformations 

Interpreted as transformations of Cartesian coordinates in the plane, the matrices 


o 

o 

o 

7 

0" 

o 

o 

7 

o 

-0 I- 


represent a reflection in the line .v 2 = a reflection in the Ax-axis, a reflection in the origin, and a stretch 
(when a > 1, or a contraction when 0 < a < 1 ) in the A- r direction, respectively. I 

EXAMPLE 6 Linear Transformations 

Our discussion preceding Example 5 is simpler than it may look at first sight. To see this, find A representing 
the linear transformation that maps (.Vj. a 2 ) onto (Ivj - 5a 2 . 3aj + 4a 2 ). 

Solution . Obviously, the transformation is 


>'l ~ 2*1 “ 5a- 2 
.V2 = 3a*i + 4a 2 . 

From this we can directly sec that the matrix is 

2x1 ~ 5V2 1 ■ 
3a-! + 4a 2 J 

If A in (1 1) is square, n X then (11) maps R n into R n . If this A is nonsingular, so that 
A” 1 exists (see Sec. 7.8), then multiplication of (11) by A"" 1 from the left and use of 
A“ 1 A = I gives the inverse transformation 

(14) x = A -1 y. 

It maps every y = y 0 onto that x, which by (1 1) is mapped onto y 0 . The inverse of a linear 
transformation is itself linear, because it is given by a matrix, as (14) shows. 




2 -5 

3 4 


Check: 


7i 

_ 72 J 


2 -5 

3 4 


*1 


PROBLEM SET 7.9 


VECTOR SPACES 

(Additional problems in Problem Set 7.4.) 

Is the given set (taken with die usual addition and scalar 
multiplication) a vector space? (Give a reason.) If your 
answer is yes, find the dimension and a basis. 

1. All vectors in R* satisfying 5v y - 3v 2 + 2u 3 = 0 

2. All vectors in R 3 satisfying 2i>! + 3v 2 - v 3 = 0, 
v i ~ 4t>2 + v s = 0 

3. All 2 X 3 matrices with all entries nonnegative 

4. AU symmetric 3X3 matrices 

5. All vectors in R 5 with the first three components 0 


6. All vectors in R 4 with v y + v 2 = 0, v 3 - u 4 = 1 

7. Ail skew-symmetric 2X2 matrices 

8. All n X n matrices A with fixed n and det A =0 

9. All polynomials with positive coefficients and degree 
3 or less 

10. All functions f(x ) = a cos a* -f b sin a with any 
constants a and b 

11. All functions /(a) = (ax + b)e~ x with any constants 
a and b 

12. All 2 X 3 matrices with the second row any multiple 
of [4 0 -9] 
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13. (Different bases) Find three bases for R 2 . 

14. (Uniqueness) Show that the representation 
v = c i a d> + ■ ■ ■ + c n a (n) of any given vector in 
an w-dimensional vector space V in terms of a given 
basis a (1) , • • • , a (n) for V is unique. 

1 5-20 1 LINEAR TRANSFORMATIONS 

Find the inverse transformation. (Show the details of your 

work.) 

15. y x = x x ~ 2*2 16. y 1 = 5* x - x 2 

y 2 ~ 4*i - 3*2 y 2 = 3*! - * 2 

17. y x = 3* t - * 2 18. )’i = 0.25*x - 0.1*3 

y 2 ~ — 5*! 4- 2*2 y 2 = *2 — 0.8*3 

y 3 = 0.2*3 

19. y 1 = 2*! — 3*2 

y 2 = - 10*! + 16*2 + * 3 

y 3 = -7*! + 11*2 + *3 


20. y x - *i "h *2 2*3 

y 2 = *1 + *2 + 2*3 

y 3 = -2*! + 2*2 4- 4*3 

2 1-26 1 INNER PRODUCT. ORTHOGONALITY 

Find the Euclidean norm of the vectors 

21. [4 2 -6] T 

22. [0 -3 3 0 5 1] T 

23. [16 -32 0] T 

24. a | I 2] r 

25. [0 1 0 0 -1 1 -If 

26- [§ -I If 

27. (Orthogonality) Show that the vectors in Probs. 21 
and 23 are orthogonal. 

28. Find all vectors v in R 3 orthogonal to [2 0 1] T . 

29. (Unit vectors) Find all unit vectors orthogonal to 

[4 -3] T . Make a sketch. 

30. (Triangle inequality) Verify (4) for the vectors in 
Probs. 21 and 23. 


RE33t£SEEE3HB5SBE5TIONS AND PROBLEMS 


1. What properties of matrix multiplication differ from 
those of the multiplication of numbers? What about 
division of matrices? 

2. Let A be a 50 X 50 matrix and B a 50 X 20 matrix. 
Are the following expressions defined or not? A 4- B, 
A 2 , B 2 , AB, BA, AA T , B t A, B t B, BB t , B t AB. (Give 
reasons.) 

3. How is matrix multiplication motivated? 

4. Are there any linear systems without solutions? With 
one solution? With more than one solution? Give simple 
examples. 

5. How can you give the rank of a matrix in terms of row 
vectors? Of column vectors? Of determinants? 

6. What is the role of rank in connection with solving 
linear systems? 

7. What is the row space of a matrix? The column space? 
The null space? 

8. What is the idea of Gauss elimination and back 
substitution? 

9. What is the inverse of a matrix? When does it exist? 
How would you determine it? 

10. What is Cramer’s rule? When would you apply it? 

1 1 1-19 1 LINEAR SYSTEMS 

Find all solutions or indicate that no solution exists. (Show 

the details of your work.) 


11. 9* - 3y = 15 

5* + 4y = 48 

12. —2* — 4y -1- Iz = —6 

x + 2y + 1 6z = 3 

13. 3* + 5 y - 8z = 18 14. 5* - lOy = 2 

* + 2y -3 z= 6 3* 4- y = 13 

-* + 6y = 6 

15. -8* + 2z = 1 16. 2y + z = -1 

6y + 4z = 3 2* + 3y — z = — 12 

12* + 2y =2 5* - 4y + 3z = 32 

17. 3* + 7y = 0 18. -* 4* 4y - 2 Z = 1 

5* — 4 y = 47 3* 4- 4 y 4- 6z = I 

6* + 9y = 15 * — 2y 4- 2z = — 

19. 7* 4- 9y - \4z = 36 

-* - 3y + 2z = -12 
2* + y ~ 4z = 4 


CO|»— 
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20-30 1 CALCULATIONS WITH MATRICES AND |37-42| INVERSE 

VECTORS Find the inverse or state why it does not exist. (Show details.) 

Calculate the following expressions (showing the details of 37. Of the coefficient matrix in Prob. 1 1 

your work) or indicate why they do not exist, when 38 . of the coefficient matrix in Prob. 15 

39. Of the coefficient matrix in Prob. 16 

40. Of the coefficient matrix in Prob. 18 

41. Of the augmented matrix in Prob. 14 

42. Of the diagonal matrix with entties 3, — L 5 




20. AB, BA 21. A - A T 

22. A 2 4- B 2 23. det A, del B, det AB 

24. AA t , A t A 25. 0.2BB T 

26. Aa, a T A, a T Aa 27. a T b, b T a, ab T 

28. b T Bb 29. a T B, B T a 

30. 0.1(A + A t )(B - B t ) 

3 1-36 1 RANK 

Determine the ranks of the coefficient matrix and the 
augmented matrix and state how many solutions the linear 
system will have. 

31. In Prob. 13 32. In Prob. 12 33. In Prob. 17 

34. In Prob. 14 35. In Prob. 19 36. In Prob. 18 


43—15 1 NETWORKS 

Find the currents in the following networks. 

43. ion 44. 3800 v 



45. 100 o 


VVV 

'A 

y|l020 V 

1 j' VVV 

7 3 30 0 

AAA 

“ r’|'540 V 

VVV 
20 0 



Linear Algebra: Matrices, Vectors, Determinants 
Linear Systems of Equations 


An m X n matrix A = [aj k ] is a rectangular array of numbers or functions (“entries”, 
“elements”) arranged in m horizontal rows and n vertical columns. If m = n, the 
matrix is called square. A 1 X n matrix is called a row vector and an m X I matrix 
a column vector (Sec. 7.1). 

The sum A + B of matrices of the same size (i.e., both m X n) is obtained by 
adding corresponding entries. The product of A by a scalar c is obtained by 
multiplying each by c (Sec. 7.1). 

The product C = AB of an m X n matrix A by an r X p matrix B = [bj k ] is 
defined only when r = n> and is the m X p matrix C = [c jfc ] with entries 


( 1 ) 


c jk a j\blk a j2&2k + * * * -F Uj n b n fc 


(row j of A times 
column k of B). 
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This multiplication is motivated by the composition of linear transformations 
(Secs. 7.2, 7.9). It is associative, but is not commutative: if AB is defined, BA may 
not be defined, but even if BA is defined, AB =f BA in general. Also AB = 0 may 
not imply A = 0 or B = 0 or BA = 0 (Secs. 7.2, 7.8). Illustrations: 


c ;r: _k a 
r: .3 c ;h:-:] 

" □" 21 -C 



The transpose A T of a matrix A = [cij k ] is A T = [a k j]\ rows become columns and 
conversely (Sec. 7.2). Here, A need not be square. If it is and A = A T , then A is called 
symmetric; if A = - A T , it is called skew-symmetric. For a product, (AB) T = B T A T 
(Sec. 7.2). 


A main application of matrices concerns linear systems of equations 
(2) Ax = b (Sec. 7.3) 


( m equations in n unknowns jq, • • • , jc n ; A and b given). The most important method 
of solution is the Gauss elimination (Sec. 7.3), which reduces the system to 
“triangular” form by elementary row operations , which leave the set of solutions 
unchanged. (Numeric aspects and variants, such as Doolittle's and Cholesky’s 
methods , are discussed in Secs. 20.1 and 20.2) 

Cramer’s rule (Secs. 7.6, 7.7) represents the unknowns in a system (2) of n 
equations in n unknowns as quotients of determinants; for numeric work it is 
impractical. Determinants (Sec. 7.7) have decreased in importance, but will retain 
their place in eigenvalue problems, elementary geometry, etc. 

The inverse A” 1 of a square matrix satisfies AA~ a = A“ 1 A = I. It exists if and 
only if det A 0. It can be computed by the Gauss-Jordan elimination (Sec. 7.8). 

The rank r of a matrix A is the maximum number of linearly independent rows 
or columns of A or, equivalently, the number of rows of the largest square submatrix 
of A with nonzero determinant (Secs. 7.4, 7.7). 

The system (2) has solutions if and only if rank A = rank [A b], where [A b] 
is the augmented matrix (Fundamental Theorem, Sec, 7.5). 

The homogeneous system 
(3) Ax = 0 


has solutions x^O (“nontrivial solutions”) if and only if rank A < «, in the case 
m = n equivalently if and only if det A = 0 (Secs. 7.6, 7.7). 

Vector spaces, inner product spaces, and linear transformations are discussed in 
Sec. 7.9. See also Sec. 7.4. 





CHAPTER 8 

Linear Algebra: 

Matrix Eigenvalue Problems 


Matrix eigenvalue problems concern the solutions of vector equations 
(1) Ax = Ax 

where A is a given square matrix and vector x and scalar A are unknown. Clearly, x = 0 
is a solution of (I), giving 0 = 0. But this of no interest, and we want to find solution 
vectors x ^ 0 of (1), called eigenvectors of A. We shall see that eigenvectors can be 
found only for certain values of the scalar A; these values A for which an eigenvector 
exists are called the eigenvalues of A. Geometrically, solving (1) in this way means that 
we are looking for vectors x for which the multiplication of x by the matrix A has the 
same effect as the multiplication of x by a scalar A, giving a vector Ax with components 
proportional to those of x, and A as the factor of proportionality. 

Eigenvalue problems are of greatest practical interest to the engineer, physicist, and 
madiematician, and we shall see that their theory makes up a beautiful chapter in linear 
algebra that has found numerous applications. 

We shall explain how to solve that vector equation (1) in Sec. 8.1, show a few typical 
applications in Sec. 8.2, and then discuss eigenvalue problems for symmetric, 
skew-symmetric, and orthogonal matrices in Sec. 8.3. In Sec. 8.4 we show how to obtain 
eigenvalues by diagonalization of a matrix. We also consider the complex counterparts of 
those matrices (Hermitian, skew-Hermitian, and unitary matrices, Sec. 8.5), which play a 
role in modem physics. 

COMMENT. Numerics for eigenvalues (Secs. 20.6-20.9) can be studied immediately 
after this chapter. 

Prerequisite: Chap. 7. 

Sections that may be omitted in a shorter course: 8.4, 8.5 

References and Answers to Problems: App. 1 Part B, App. 2. 
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CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 


8.1 Eigenvalues, Eigenvectors 

From the viewpoint of engineering applications, eigenvalue problems are among the most 
important problems in connection with matrices, and the student should follow the present 
discussion with particular attention. We begin by defining the basic concepts and show how 
to solve these problems, by examples as well as in general. Then we shall turn to applications. 
Let A = [a jk ] be a given n X n matrix and consider the vector equation 

(1) Ax = Ax. 

Here x is an unknown vector and A an unknown scalar. Our task is to determine x’s and 
A’s that satisfy (1 ). Geometrically, we are looking for vectors x for which the multiplication 
by A has the same effect as the multiplication by a scalar A; in other words, Ax should 
be proportional to x. 

Clearly, the zero vector x = 0 is a solution of (1) for any value of A, because AO = 0. 
This is of no interest. A value of A for which ( 1 ) has a solution x =£ 0 is called an eigenvalue 
or characteristic value (or latent root) of the matrix A. (“Eigen” is German and means 
“proper” or “characteristic.”) The corresponding solutions x # 0 of (1) are called the 
eigenvectors or characteristic vectors of A corresponding to that eigenvalue A. The set 
of all the eigenvalues of A is called the spectrum of A. We shall see that the spectrum 
consists of at least one eigenvalue and at most of n numerically different eigenvalues. The 
largest of die absolute values of the eigenvalues of A is called the spectral radius of A, 
a name to be motivated later. 

How to Find Eigenvalues and Eigenvectors 

The problem of determining the eigenvalues and eigenvectors of a matrix is called an 
eigenvalue problem. (More precisely: an algebraic eigenvalue problem, as opposed to 
an eigenvalue problem involving an ODE, PDE (see Secs. 5.7 and 12.3) or integral 
equation.) Such problems occur in physical, technical, geometric, and other applications, 
as we shall see. We show how to solve them, first by an example and then in general. 
Some typical applications will follow afterwards. 

EXAMPLE 1 Determination of Eigenvalues and Eigenvectors 

We illustrate all the steps in terms of the matrix 



Solution . (a) Eigenvalues. These must be determined first. Equation (1) is 


Ax = 



in components, 


Transferring the terms on the right to the left, we get 


( 2 *) 


(-5 - \) Xl + lx 2 =0 
2.vx + (-2 - A)a* 2 = 0. 


—5xi + 2*2 = A*! 
l\l - 2*2 = A*2- 


This can be written in matrix notation 
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(3*) 


(A - AI)x= 0 


because (1) is Ax - Ax = Ax - AIx = (A - AI)x = 0, which gives (3*). We see that this is a homogeneous 
linear system. By Cramer’s theorem in Sec. 7.7 it has a nontrivial solution x =£ 0 (an eigenvector of A we are 
looking for) if and only if its coefficient determinant is zero, that is. 


(4*) 


D{ A) = det (A - AI) = 



2 

-2 - A 


= (-5 


A)(- 2 - A) - 4 = A 2 + 7A + 6 = 0. 


We call D{ A) the characteristic determinant or, if expanded, the characteristic polynomial, and D{ A) = 0 
the characteristic equation of A. The solutions of this quadratic equation are A x = — 1 and A 2 = —6. These 
are the eigenvalues of A. 

(bj) Eigenvector of A corresponding to A x . This vector is obtained from (2*) with A = A t = — 1, that is. 


-4*! + Zx 2 = 0 


2x x - x 2 = 0. 


A solution is x 2 ~ 2x lt as we see from cither of the two equations, so that we need only one of them. This 
determines an eigenvector corresponding to A x = - l up to a scalar multiple. If we choose x x = l, we obtain 
the eigenvector 





Check: 


’-5 2 i pi r-r 

AX X = = (- 

L 2 - 2 j L 2 J L- 2 J 


" l) x i — AxXj. 


(b 2 ) Eigenvector of A corresponding to A 2 . For A = A 2 = —6, equation (2*) becomes 


Xi + 2*2 — 0 

2x 1 -1- 4^ = 0. 


A solution is x 2 = — X]/2 with arbitrary x v If we choose = 2, we get jc 2 = — Thus an eigenvector of A 
corresponding to A 2 = —6 is 


x 2 = 



Check: 



21 (" 21 T-12‘ 

= = (-6)x 2 = A 2 x 2 . 

-2j L-lJ L 6j 


This example illustrates the general case as follows. Equation (1) written in components is 

a n x i + • * • + ci ln x n — Aa*j 

<321*1 + * • • + <*2 n*n = A* 2 


^nl*l 4" * * ' 4* o nn x n A.v n . 

Transferring the terms on the right side to the left side, we have 


( 2 ) 


(a n - A)*! 4- a^x 2 

#21*1 4* (#22 A)a 2 

a n\ x 1 4- a n2 x 2 


4- • • 

• 4- 

0 

II 

1 

+ • • 

• + 

a 2n x n — 0 

4- • • 

• 4- 

(^nn A)x n 0. 


In matrix notation, 


(3) 


(A - AI)x = 0. 
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THEOREM 2 


PROOF 


By Cramer’s theorem in Sec. 7.7, this homogeneous linear system of equations has a 
nontrivial solution if and only if the corresponding determinant of the coefficients is zero: 


(4) D( A) = det (A - AI) = 


#n A a 12 

#21 #22 ““ ^ 

#nl #n2 


#ln 

#2n 


#nn 4 


= 0 . 


A — AI is called the characteristic matrix and D(A) the characteristic determinant of 
A. Equation (4) is called the characteristic equation of A. By developing D( A) we obtain 
a polynomial of nth degree in A. This is called the characteristic polynomial of A. 

This proves the following important theorem. 


Eigenvalues 

The eigenvalues of a square matrix A are the roots of the characteristic equation 
(4) of A, 

Hence an n X n matrix has at least one eigenvalue and at most n numerically 
different eigenvalues. 


For larger «, the actual computation of eigenvalues will in general require the use 
of Newton’s method (Sec. 19.2) or another numeric approximation method in 
Secs. 20.7-20.9. 

The eigenvalues must be determined first. Once these are known, corresponding 
eigen vectors are obtained from the system (2), for instance, by the Gauss elimination, 
where A is the eigenvalue for which an eigenvector is wanted. This is what we did in 
Example 1 and shall do again in the examples below. (To prevent misunderstandings: 
numeric approximation methods (Sec. 20.8) may determine eigen vectors first.) 

Eigenvectors have the following properties. 


Eigenvectors, Eigenspace 

If vt and x are eigenvectors of a matrix A corresponding to the same eigenvalue A, 
so are w -I- x (provided x^-w) and kxfor any k =£ 0. 

Hence the eigenvectors corresponding to one and the same eigenvalue A of A, 
together with 0, form a vector space (cf. Sec. 7.4), called the eigenspace of A 
corresponding to that A. 


Aw = Aw and Ax = Ax imply A(w 4 x) = Aw + Ax = Aw 4* Ax = A(w 4- x) and 

A(/:w) = k( Aw) = k( Aw) = A(fcw); hence A(kv? 4 tx) = X(kw 4 €x). ■ 

In particular, an eigenvector x is determined only up to a constant factor. Hence we can 
normalize x, that is, multiply it by a scalar to get a unit vect or (see Sec. 7.9). For 

instance, x 1 = [1 2] T in Example 1 has the length || x : || = Vl 2 4 2 2 = V5; hence 

[l/V5 2 /V 5 ] is a normalized eigenvector (a unit eigenvector). 
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EXAMPLE 2 


Examples 2 and 3 will illustrate that an n X n matrix may have n linearly independent 
eigenvectors, or it may have fewer than n. In Example 4 we shall see that a real matrix 
may have complex eigenvalues and eigenvectors. 

Multiple Eigenvalues 

Find the eigenvalues and eigenvectors of 


A = 


"-2 

2 

.-1 


2 -3“ 

1 -6 . 
-2 0 * 


Solution . For our matrix, the characteristic determinant gives the characteristic equation 

-A 3 - A 2 4- 21A + 45 = 0. 


The roots (eigenvalues of A) are A x = 5, A 2 = A 3 = -3. To Find eigenvectors, we apply the Gauss elimination 
(Sec. 7.3) to the system (A — AI)x = 0, first with A = 5 and then with A = -3. For A = 5 the characteristic 
matrix is 



‘-7 

2 

-3" 


"-7 

2 

-3" 

A — AI = A — 51 = 

2 

-4 

-6 

It row-reduces to 

0 

24 

7 

-f 


.-1 

-2 

— 5_ 


. 0 

0 

0 . 


Hence it has rank 2. Choosing .r 3 = - 1 we have .v 2 = 2 from — ^.v 2 — ^* 3 = 0 and then .v x * l from 
-7 a*! + lv 2 — 3*3 = 0. Hence an eigenvector of A coresponding to A = 5 is x 1 = [1 2 — 1] T . 

For A ~ -3 the characteristic matrix 



1 

fN 


“1 2 -3“ 

A — AI = A + 31 = 

2 4-6 

row-reduces to 

0 0 0 


04 

1 

T 


.0 0 0. 


Hence it has rank 1 . From x x + 2x 2 — 3 a 3 = 0 we have x x = — 2x 2 + 3x 3 . Choosing x 2 = 1 , * 3 = 0 and 
* 2 = 0 , * 3 = 1 , we obtain two linearly independent eigenvectors of A corresponding to A = -3 [as they must 
exist by (5), Sec. 7.5, with rank = 1 and n = 3 ], 


and 



The order M h of an eigenvalue A as a root of the characteristic polynomial is called the 
algebraic multiplicity of A. The number m K of linearly independent eigenvectors 
corresponding to A is called the geometric multiplicity of A. Thus m A is the dimension of 
the eigenspace corresponding to this A. Since the characteristic polynomial has degree n, 
the sum of all the algebraic multiplicities must equal n. In Example 2 for A = —3 we have 
m x = M x = 2. In general, m x ^ M x , as can be shown. The difference A x = M x — m x is 
called the defect of A. Thus A_ 3 = 0 in Example 2, but positive defects A a can easily occur: 



338 


CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 


EXAMPLE 3 Algebraic Multiplicity, Geometric Multiplicity. Positive Defect 

The characteristic equation of the matrix 


"0 1 

r 


l-A 11 

A = 

Lo ( 

is 

)J 

det (A - AI) = 

O 

1 

>- 


Hence A = 0 is an eigenvalue of algebraic multiplicity M 0 = 2. But its geometric multiplicity is only w 0 = 1, 
since eigenvectors result from — 0*i 4* x 2 = 0. hence x 2 = 0, in the form [jc x 0] T . Hence for A = 0 the defect 
is A 0 = l. 

Similarly, the characteristic equation of the matrix 


T3 ; 

r 


3 - A 2 

A = 

.0 : 

is 

L 

del (A — AI) — 

0 3 - A 


Hence A = 3 is an eigenvalue of algebraic multiplicity M3 = 2, but its geometric multiplicity is only m 3 = 1, 
since eigenvectors result from 0 jc x + lv 2 = 0 in the form [x x 0] T . H 


EXAMPLE 4 Real Matrices with Complex Eigenvalues and Eigenvectors 

Since real polynomials may have complex roots (which then occur in conjugate pairs), a real matrix may have 
complex eigenvalues and eigenvectors. For instance, the characteristic equation of the skew-symmetric matrix 


r 0 n 



-A 1 

> 

II 

0 

1 

is 

del (A - AI) = 

-1 -A 


It gives the eigenvalues A x = i (=V-T), A 2 = — /. Eigenvectors are obtained from — u: x 4- jc 2 = 0 and 
ixi 4 x 2 = 0, respectively, and we can choose jc x = 1 to get 


“ 1 " 
- /_ 


and 



In the next section we shall need the following simple theorem. 


THEOREM 3 


Eigenvalues of the Transpose 

The transpose A T of a square matrix A has the same eigenvalues as A. 


PROOF Transposition does not change the value of the characteristic determinant, as follows from 
Theorem 2d in Sec. 7.7. ■ 

Having gained a first impression of matrix eigenvalue problems, in the next section we 
illustrate their importance with some typical applications. 




1-25 


EIGENVALUES AND EIGENVECTORS 


Find the eigenvalues and eigenvectors of the following 
matrices. (Use the given A or factors.) 



4. 


‘0 

.0 


O' 

0 . 
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26. (Multiple eigenvalues) Find further 2 X 2 and 3 X 3 
matrices with multiple eigenvalues. (See Example 2.) 

27. (Nonzero defect) Find further 2X2 and 3X3 
matrices with positive defect. (See Example 3.) 

28. (Transpose) Illustrate Theorem 3 with examples of 
your own. 

29. (Complex eigenvalues) Show that the eigenvalues of 
a real matrix are real or complex conjugate in pairs. 

30. (Inverse) Show that the inverse A” 1 exists if and only 
if none of the eigenvalues A r , • • • , A* of A is zero, and 
then A" 1 has the eigenvalues l/A^ • • • , 1/A*. 
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8.i Some Applications of Eigenvalue Problems 

In this section we discuss a few typical examples from the range of applications of matrix 
eigenvalue problems, which is incredibly large. Chapter 4 shows matrix eigenvalue 
problems related to ODEs governing mechanical systems and electrical networks. To keep 
our present discussion independent of Chap. 4, we include a typical application of that 
kind as our last example. 


EXAMPLE 1 Stretching of an Elastic Membrane 

An elastic membrane in the *i* 2 -plane with boundary circle *i 2 + * 2 2 = 1 (Fig. 158) is stretched so that a 
point P : (a'x, * 2 ) goes over into the point Q : (y lf y 2 ) given by 


( 1 ) 



in components, 


>’i = 5 a*i + 3*2 
y 2 = 3*! + 5 a 2 . 


Find the principal directions, that is, the directions of the position vector x of P for which the direction of the 
position vector y of Q is the same or exactly opposite. What shape does the boundary circle take under this 
deformation? 


Solution . We are looking for vectors x such that y = Ax. Since y = Ax, this gives Ax = Ax, the equation 
of an eigenvalue problem. In components. Ax = Ax is 


( 2 ) 


5*! + 3a 2 = A*i 
3*! + 5*2 = A*2 


The characteristic equation is 


(3) 


5 - A 
3 


(5 - A)*! + 3*2 = 0 

or 

3*i + (5 - A)* 2 = 0. 


3 

5 - A 


= (5 - A) 2 - 9 = 0. 


Its solutions are A x = 8 and A 2 = 2. These are the eigenvalues of our problem. For A = A x = 8, our system 
(2) becomes 


-3*i + 3*2 = 0, 


Solution * 2 — *i, *i arbitrary. 


3*i - 3*2 = 0. 


for instance, * x = * 2 = 1. 


For A 2 = 2, our system (2) becomes 


3*i + 3*2 — 0, Solution * 2 = — * it * x arbitrary. 

3*i + 3*2 = 0. for instance, *i = 1, * 2 = —I. 

We thus obtain as eigenvectors of A, for instance, [1 11 T corresponding to Ai and [1 -1] T corresponding to 
A 2 (or a nonzero scalar multiple of these). These vectors make 45° and 1 35° angles with the positive * r direction. 
They give the principal directions, the answer to our problem. The eigenvalues show that in the principal 
directions the membrane is stretched by factors 8 and 2, respectively; see Fig. 158. 

Accordingly, if we choose the principal directions as directions of a new Cartesian ^-coordinate system, 
say, with the positive «i-semi-axis in the first quadrant and the positive // 2 -se mi-axis in the second quadrant of 
the *i* 2 -system, and if we set «i = r cos <f > , « 2 = r sin & a boundary point of the unstretched circular 
membrane has coordinates cos </>, sin <f>. Hence, after the stretch we have 


Zi = 8 cos <j), z 2 = 2 sin <£. 


Since cos 2 <f> + sin 2 <f> = 1, this shows that the deformed boundary is an ellipse (Fig. 158) 



( 4 ) 



SEC. 8.2 Some Applications of Eigenvalue Problems 


341 



Fig. 158. Undeformed and deformed membrane in Example 1 

EXAMPLE 2 Eigenvalue Problems Arising from Markov Processes 

Markov processes as considered in Example 13 of Sec. 7.2 lead to eigenvalue problems if we ask for the limit 
state of the process in which the state vector x is reproduced under the multiplication by the stochastic matrix 
A governing the process, that is. Ax = x. Hence A should have the eigenvalue 1 , and x should be a corresponding 
eigenvector. Tliis is of practical interest because it shows the long-term tendency of the development modeled 
by the process. 

In that example. 



i 

o 

p 

o 


'0.7 0.2 o.r 


V 


V 

A = 

0.2 0.9 0.2 

For the transpose. 

0.1 0.9 0 


1 

= 

l 


_0.1 0 0.8. 


. 0 0.2 0.8. 


. 1 . 


.1. 


Hence A T has the eigenvalue 1, and the same is true for A by Theorem 3 in Sec. 8.1. An eigenvector x of A 
for A = 1 is obtained from 



'“0.3 

0.1 

0 “ 


'-3/10 

1/10 

0 ~ 

A - I = 

0.2 

-0.1 

0.2 

, row-reduced to 

0 

-1/30 

1/5 


. 0.1 

0 

-0.2. 


. 0 

0 

0 . 


Taking .v 3 = 1, we get a 2 = 6 from — a 2 /30 -f- ,v 3 /5 = 0 and then x 1 = 2 from -3;q/I0 + ,v 2 /I0 = 0. This 
gives x = [2 6 1] T . It means that in the long run, the ratio Commercial: Industrial: Residential will approach 

2:6:1, provided that the probabilities given by A remain (about) the same. (We switched to ordinary fractions 
to avoid rounding errors.) I 

EXAMPLE 3 Eigenvalue Problems Arising from Population Models. Leslie Model 

The Leslie model describes age-specified population growth, as follows. Let the oldest age attained by the 
females in some animal population be 9 years. Divide the population into three age classes of 3 years each. Let 
the “Leslie matrix” be 




' 0 

2.3 

0.4' 

(5) 

L = [/ jfc ] = 

0.6 

0 

0 



. 0 

0.3 

0 _ 


where is the average number of daughters born to a single female during the time she is in age class k , and 
U = 2. 3) is the fraction of females in age class j - 1 that will survive and pass into class j. (a) What is 
die number of females in each class after 3, 6, 9 years if each class initially consists of 400 females? (b) For 
what initial distribution will the number of females in each class change by the same proportion? What is this 
rate of change? 
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EXAMPLE 4 


Solution . (a) Initially, x} 0) = [400 400 4001. After 3 years. 


x (3) ~ Lx«» ~ 


' 0 
0.6 
_ 0 


2.3 

0 

0.3 



"400" 


"1080" 


400 

= 

240 


.400. 


. 120. 


Similarly, after 6 years the number of females in each class is given by = (Lx C3) ) T = [600 648 72], and 
after 9 years we have xj 9) = (LX( 6) ) T = [1519.2 360 194.4]. 

(b) Proportional change means that we are looking for a distribution vector x such that Lx = Ax, where A 
is the rate of change (growth if A > 1 , decrease if A < 1 ). The characteristic equation is (develop the characteristic 
determinant by the first column) 

det (L - AI) = -A 3 - 0.6(-2.3A - 0.3 • 0.4) = -A 3 + I.38A + 0.072 = 0. 


A positive root is found to be (for instance, by Newton’s method. Sec. 19.2) A = 1 .2. A corresponding eigenvector 
x can be determined from the characteristic matrix 



-1.2 

2.3 

0.4" 



" 1 " 

A - 1.21 = 

0.6 

-1.2 

0 

, say. 

x = 

0.5 


. 0 

0.3 

-1.2. 



.0.125. 


where * 3 = 0.125 is chosen, * 2 = 0.5 then follows from 0.3.v 2 — 1.2x 3 = 0, and = 1 from 
-1.2*! + 2.3*2 + 0.4* 3 = 0. To get an initial population of 1200 as before, we multiply x by 
1200/(1 + 0.5 -1- 0.125) = 738. Answer: Proportional growth of the numbers of females in the three classes 
will occur if the initial values are 738, 369, 92 in classes 1, 2, 3, respectively. The growth rate will be 1.2 per 
3 years. H 


Vibrating System of Two Masses on Two Springs (Fig. 159) 

Mass-spring systems involving several masses and springs can be treated as eigenvalue problems. For instance, 
the mechanical system in Fig. 159 is governed by the system of ODEs 

= -5yi + 2y 2 

( 6 ) 

.V2 = 2)’! - 2 y 2 


where y x and y 2 are the displacements of the masses from rest, as shown in the figure, and primes denote 
derivatives with respect to time t. In vector form, this becomes 




Fig. 159. Masses on springs in Example 4 
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We try a vector solution of the form 

(8) y = xe u \ 

This is suggested by a mechanical system of a single mass on a spring (Sec. 2.4), whose motion is given by 
exponential functions (and sines and cosines). Substitution into (7) gives 

o) 2 xe wt = Axe wt . 

Dividing by e wt and writing (o 2 = A, we see that our mechanical system leads to the eigenvalue problem 

(9) Ax = Ax where A = o> 2 . 

From Example 1 in Sec. 8.1 we see that A has the eigenvalues Aj = — 1 and A 2 = —6. Consequently, 
<o — V— 1 = ±i and A/— 6 = ±i‘V6, respectively. Corresponding eigenvectors are 



From (8) we thus obtain the four complex solutions [see (10), Sec. 2.2) 

x x e ±lt = x^cos t ± i sin t) y 
x 2 e^ = x 2 (cos V6 1 ± i sin V6 /). 

By addition and subtraction (see Sec. 2.2) we get the four real solutions 

Xi cos f, Xi sin u x 2 cos V 6 1 , x 2 sin V6 /. 

A general solution is obtained by taking a linear combination of these, 

y = Xjiaj cos / -I- b x sin /) -I- x 2 (a 2 cos V 6 1 -f b 2 sin V6 /) 

with arbitrary constants b x , a 2 , b 2 (to which values can be assigned by prescribing initial displacement and 
initial velocity of each of the two masses). By ( 10), the components of y are 

>’i = a i cos 1 + b x sin / + 2 a 2 cos V6 / + lb 2 sin V 6 1 
y 2 = 2 cos t + 2 by sin / - a 2 cos V6 1 - b 2 sin Vb t. 

These functions describe harmonic oscillations of the two masses. Physically, this had to be expected because 
we have neglected damping. ■ 


T^l LINEAR TRANSFORMATIONS |7-U[ ELASTIC DEFORMATIONS 

Find the matrix A in the indicated linear transformation Given A in a deformation y = Ax, find the principal 

y = Ax. Explain the geometric significance of the directions and corresponding factors of extension or 

eigenvalues and eigenvectors of A. Show the details. contraction. Show the details. 


1. Reflection about the y-axis in R 2 

2. Reflection about the xy-plane in R 3 

3. Orthogonal projection (perpendicular projection) of R 2 
onto the jc-axis 

4. Orthogonal projection of R 3 onto the plane y = x 

5. Dilatation (uniform stretching) in R 2 by a factor 5 

6. Counterclockwise rotation through the angle 77/2 about 
the origin in R 2 
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“-2 3" 


' 10.5 i/vr 



‘0.6 

0.1 

0.2" 

13. 

. 3 -2. 

14. 

.1/V2 10.0. 


20. 

0.4 

0.1 

0.4 

15. (Leontief 1 input-output model) Suppose that three 


. 0 

0.8 

0.4. 


industries are interrelated so that their outputs are used 
as inputs by themselves, according to the 3 X 3 

consumption matrix 


" 0.2 


0.5 


21-23 


POPULATION MODEL WITH AGE 
SPECIFICATION 

Find the growth rate in the Leslie model (see Example 3) 
with the matrix as given. (Show details.) 


A = [a jk \ = 


0.6 

- 0.2 


0 

0.5 


0.3 

0.7. 


where a jk is the fraction of the output of industry k 
consumed (purchased) by industry j. Let pj be the price 
charged by industry j for its total output. A problem is 
to find prices so that for each industry, total 
expenditures equal total income. Show that this leads 
to Ap = p, where p = [p x p 2 p 3 ] T , and find a 
solution p with nonnegative p x , p 2 , p$- 


16. Show that a consumption matrix as considered in Prob. 
15 must have column sums 1 and always has the 
eigenvalue 1 . 


17. (Open Leontief input-output model) If not the whole 
output but only a portion of it is consumed by the 
industries themselves, then instead of Ax = x (as in 
Prob. 15), we have x - Ax = y, where x = [a j x 2 .v 3 ] t 
is produced, Ax is consumed by the industries, and, thus, 
y is the net production available for other consumers. 
Find for what production x a given demand vector 
y = [0.136 0.272 0.1 36] T can be achieved if the 


consumption matrix is 




" 0.2 

0.4 

0 . 2 ” 

A = 

0.3 

0 

0.1 


. 0.2 

0.4 

0.5. 


18-20 


MARKOV PROCESSES 


Find limit states of the Markov processes modeled by the 
following matrices. (Show the details.) 



‘ 0 

3.45 

0.60' 


21 . 

0.90 

0 

c 




. 0 

0.45 

0 J 



" 0 

12.0 

0 “ 



22 . 

0.75 

0 

0 




. 0 

0.30 

0 . 




" 0 

7.280 


2.975" 

23. 

0.560 

0 


0 



. 0 

0.420 


0 J 


24. TEAM PROJECT. General Properties of 
Eigenvalues and Eigenvectors. Prove the following 
statements and illustrate them with examples of your 
own choice. Here, A T , • • • , A n are the (not necessarily 
distinct) eigenvalues of a given n X n matrix A = [tf j7c ]. 

(a) Trace. The sum of the main diagonal entries is called 
the trace of A. It equals the sum of the eigenvalues. 

(b) “Spectral shift.” A — kl has the eigenvalues 

A x — k n — k and the same eigenvectors as A. 

(c) Scalar multiples, powers. kA has the eigenvalues 
k\ x , • • • , £A„. A w (/// = 1 . 2, • • •) has the eigenvalues 
Aj” 1 , • • • . A rl m . The eigenvectors are those of A. 

(d) Spectral mapping theorem. The “polynomial 
matrix” 

P( A) = k m A m + A” 1 " 1 +■■■ + k,A + k 0 l 


r 

0.4" 


Lo.9 

0 . 6 . 


"0.5 

0.3 

0 . 2 " 

0.3 

0.5 

0.2 

. 0.2 

0.2 

0 . 6 . 


has die eigenvalues 

p(Kj) = k m \j m + k m _ iA/ ,l ~ l + • • * + k t \y + k 0 

where j = I , • ■ * . n, and the same eigenvectors as A. 
(e) Perron’s theorem. Show that a Leslie matrix L with 
positive / 12? / 13 , / 21 , iz 2 has a positive eigenvalue. (This 
is a special case of the famous Perron-Frobenius theorem 
in Sec. 20.7, which is difficult to prove in its general form.) 


1 WASSILY LEONTIEF (1906-1999). American economist at New York University. For his input-output 
analysis he was awarded the Nobel Prize in 1973. 
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8.3 Symmetric, Skew-Symmetric, 
and Orthogonal Matrices 

We consider three classes of real square matrices that occur quite frequently in applications 
because they have several remarkable properties which we shall now discuss. The first 
two of these classes have already been mentioned in Sec. 7.2. 


DEFINITIONS 


Symmetric, Skew-Symmetric, and Orthogonal Matrices 

A real square matrix A = [a jk ] is called 

symmetric if transposition leaves it unchanged, 

(1) A T - A, thus a w = a jk , 

skew-symmetric if transposition gives the negative of A, 

(2) A T = -A, thus a kj = -a jkt 

orthogonal if transposition gives the inverse of A, 

(3) A T = A“ x . 


EXAMPLE 1 Symmetric, Skew-Symmetric, and Orthogonal Matrices 

The matrices 


"-3 

1 

5“ 


" 0 9 

- 12 “ 


r 2 

3 

1 

3 

2“ 

3 

1 

0 

-2 

♦ 

—9 0 

20 

. 

2 

3 

2 

3 

1 

3 

. 5 

-2 

4. 


.12 -20 

0 . 


I 

L 3 

2 

3 

-a 

3j 


are symmetric, skew-symmetric, and orthogonal, respectively, as you should verify. Every skew-symmetric 
matrix has all main diagonal entries zero. (Can you prove this?) H 

Any real square matrix A may be written as the sum of a symmetric matrix R and a 
skew-symmetric matrix S, where 

(4) R = |(A + A t ) and S = |(A - A 1 ). 

EXAMPLE 2 Illustration of Formula (4) 



"9 

5 

1 " 


"9.0 

3.5 

3.5“ 


“ 0 

1.5 

-1.5" 

A = 

2 

3 

-8 

= R + S = 

3.5 

3.0 

-2.0 

-1- 

-1.5 

0 

-6.0 


.5 

4 

3. 


.3.5 

-2.0 

3.0. 


. 1.5 

6.0 

0 . 
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THEOREM 1 


EXAMPLE 3 


THEOREM 2 


Eigenvalues of Symmetric and Skew-Symmetric Matrices 

(a) The eigenvalues of a symmetric matrix are real 

(b) The eigenvalues of a skew-symmetric matrix are pure imaginary or zero. 


This basic theorem (and an extension of it) will be proved in Sec. 8.5. 

Eigenvalues of Symmetric and Skew-Symmetric Matrices 

The matrices in (1) and (7) of Sec. 8.2 are symmetric and have real eigenvalues. The skew-symmetric matrix 
in Example 1 has the eigenvalues 0, —25/, and 25 /. (Verify this.) The following matrix has the real eigenvalues 
1 and 5 but is not symmetric. Does this contradict Theorem 1 ? 


"3 

„1 


Orthogonal Transformations and Orthogonal Matrices 

Orthogonal transformations are transformations 


(5) 


y = Ax where A is an orthogonal matrix. 


With each vector x in R n such a transformation assigns a vector y in R n . For instance, 
the plane rotation through an angle 0 


( 6 ) 



cos 0 
sin 6 


-sin 0‘ 
cos 6 



1 

H 

1 


£ 


is an orthogonal transformation. It can be shown that any orthogonal transformation in 
the plane or in three-dimensional space is a rotation (possibly combined with a reflection 
in a straight line or a plane, respectively). 

The main reason for the importance of orthogonal matrices is as follows. 


Invariance of Inner Product 

An orthogonal transformation preserves the value of the inner product of vectors 
a and b in R n , defined by 

(7) a*b = a T b = [tfi • • • <7 n ] 


That is, for any a and b in R n , orthogonal n X n matrix A, and u = Aa, v = Ab 
we have u # v = a # b. 

Hence the transformation also preserves the length or norm of any vector a in 
R n given by 

(8) || a || — Va*a = Va*a. 
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PROOF 


THEOREM 3 


PROOF 


THEOREM 4 


PROOF 


EXAMPLE 4 


Let A be orthogonal. Let u = Aa and v = Ab. We must show that u*v = a*b. Now 
(Aa) T = a T A T by (lOd) in Sec. 7.2 and A T A = A“ 1 A = I by (3). Hence 

(9) u*v = u T v = (Aa) T Ab = a T A T Ab = a T Ib = a T b = a*b. 

From this the invariance of || a || follows if we set b = a. ■ 

Orthogonal matrices have further interesting properties as follows. 


Orthonormality of Column and Row Vectors 

A real square matrix is orthogonal if and only if its column vectors a lv • * • , a n {and 
also its row vectors ) form an orthonormal system, that is. 


( 10 ) 


V a fe = a / a fc = 


0 

1 


if j * k 
if j = k. 


(a) Let A be orthogonal. Then A 1 A = A T A = I, in terms of column vectors a lf • • • , a n . 


( 11 ) 


1 

H -i 
1 



" a l Ta l 

a i T a 2 • 

• • a i T a»" 

_a» T . 

[ax- 

• - 3 n ] = 

.aja! 

an T a 2 • 

• • slJsl, 

**71 *"71_ 


The last equality implies (10), by the definition of the n X n unit matrix I. From (3) it 
follows that the inverse of an orthogonal matrix is orthogonal (see CAS Experiment 20). 
Now the column vectors of A” 1 (= A T ) are the row vectors of A. Hence the row vectors 
of A also form an orthonormal system. 

(b) Conversely, if the column vectors of A satisfy (10), the off-diagonal entries in (11) 
must be 0 and the diagonal entries 1. Hence A T A = I, as (11) shows. Similarly, AA T = L 
This implies A T = A” 1 because also A -1 A = AA” 1 = I and the inverse is unique. Hence 
A is orthogonal. Similarly when the row vectors of A form an orthonormal system, by 
what has been said at the end of part (a). ■ 


Determinant of an Orthogonal Matrix 

The determinant of an orthogonal matrix has the value +1 or — 1. 


From det AB = det A det B (Sec. 7.8, Theorem 4) and det A T = det A (Sec. 7.7, Theorem 
2d), we get for an orthogonal matrix 

1 = det I = det (AA“ X ) = det (AA T ) = det A det A T = (det A) 2 . ■ 


Illustration of Theorems 3 and 4 

The last matrix in Example 1 and the matrix in (6) illustrate Theorems 3 and 4 because their determinants are 
— 1 and +1, as you should verify. ■ 
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THEOREM 5 Eigenvalues of an Orthogonal Matrix 

The eigenvalues of an orthogonal matrix A are real or complex conjugates in pairs 
and have absolute value 1 . 


PROOF The first part of the statement holds for any real matrix A because its characteristic 
polynomial has real coefficients, so that its zeros (the eigenvalues of A) must be as 
indicated. The claim that |A| = 1 will be proved in Sec. 8.5. ■ 

EXAMPLE 5 Eigenvalues of an Orthogonal Matrix 

The orthogonal matrix in Example 1 has the characteristic equation 

-A 3 + §A 2 + §A - I = 0. 

Now one of the eigenvalues must be real (why?), hence +1 or —1. Trying, we find —1. Division by A + 1 
gives —(A 2 — 5A/3 + l) = 0 and the two eigenvalues (5 + /VTT)/6 and (5 - iVTT)/6, which have absolute 
value 1. Verify all of this. ■ 

Looking back at this section, you will find that the numerous basic results it contains have 
relatively short, straightforward proofs. This is typical of large portions of matrix 
eigenvalue theory. 


PROBLEM SET 8.3 


1. (Verification) Verify the statements in Example 1 . 

2. Verify the statements in Examples 3 and 4. 

3. Are the eigenvalues of A 4- B of the form Aj 4- p jf 
where A and pj are the eigenvalues of A and B, 
respectively? 

4. (Orthogonality) Prove that eigenvectors of a 
symmetric matrix corresponding to different 
eigenvalues are orthogonal. Give an example. 

5. (Skew-symmetric matrix) Show that the inverse of a 
skew-symmetric matrix is skew-symmetric. 

6. Do there exist nonsingular skew-symmetric n X n 
matrices with odd nl 

7. (Orthogonal matrix) Do there exist skew-symmetric 
orthogonal 3X3 matrices? 

8. (Symmetric matrix) Do there exist nondiagonal 
symmetric 3X3 matrices that are orthogonal? 

9-17 1 EIGENVALUES OF SYMMETRIC, SKEW- 
SYMMETRIC, AND ORTHOGONAL 
MATRICES 

Are the following matrices symmetric, skew-symmetric, or 
orthogonal? Find their spectrum (thereby illustrating 
Theorems 1 and 5). (Show the details of your work.) 

0.96 -0.28"1 f a bl 

0.28 0.96 J ’ L -b a\ 



18. (Rotation in space) Give a geometric interpretation of 
the transformation y = Ax with A as in Prob. 12 and 
x and y referred to a Cartesian coordinate system. 

19. WRITING PROJECT. Section Summary. 
Summarize the main concepts and facts in this section, 
with illustrative examples of your own. 
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20. CAS EXPERIMENT. Orthogonal Matrices. 

(a) Products. Inverse. Prove that the product of two 
orthogonal matrices is orthogonal, and so is the inverse 
of an orthogonal matrix. What does this mean in terms 
of rotations? 

(b) Rotation. Show that (6) is an orthogonal 
transformation. Verify that it satisfies Theorem 3. Find 
the inverse transformation. 

(c) Powers. Write a program for computing powers 
A m {m = 1, 2, • • •) of a 2 X 2 matrix A and their 
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spectra. Apply it to the matrix in Prob. 9 (call it A). To 
what rotation does A correspond? Do the eigenvalues 
of A m have a limit as m — * ac? 

(d) Compute the eigenvalues of (0.9A) m , where A is 
the matrix in Prob. 9. Plot them as points. What is their 
limit? Along what kind of curve do these points 
approach the limit? 

(e) Find A such that y = Ax is a counterclockwise 
rotation through 30° in the plane. 


8.4 Eigenbases. Diagonalization. 

Quadratic Forms 

So far we have emphasized properties of eigenvalues. We now turn to general properties 
of eigenvectors. Eigenvectors of an n X n matrix A may (or may not!) form a basis for 
R n . If we are interested in a transformation y = Ax, such an “eigenbasis” (basis of 
eigenvectors) — if it exists — is of great advantage because then we can represent any x in 
R n uniquely as a linear combination of the eigenvectors x 1? • • • , x n , say, 

X = C X X X + c 2 x 2 + • • • + c n x, r 

And, denoting the corresponding (not necessarily distinct) eigenvalues of the matrix A by 
A 1? • • • , A w , we have Axj = A jX j9 so that we simply obtain 

y = Ax = A(c 1 x 1 + • • • + c n x n ) 

(1) = c x Ax x + • • • + c n Ax n 

= C X X X X X + * • • + CnAnXn- 

This shows that we have decomposed the complicated action of A on an arbitrary vector 
x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A. This 
is the point of an eigenbasis. 

Now if the n eigenvalues are all different, we do obtain a basis: 


THEOREM 1 


Basis of Eigenvectors 

If an n X n matrix A has n distinct eigenvalues , then A has a basis of eigenvectors 
x x , • • • , x n for R n . 


PROOF All we have to show is that x x , • • • , x n are linearly independent. Suppose they are not. 

Let r be the largest integer such that [x x , • • • , x r } is a linearly independent set. Then 
r < n and the set {x x , • • • ,x r , x r+1 } is linearly dependent. Thus there are scalars 
Ci, • • • , c r+1 , not all zero, such that 

(2) c x x x + • • • + c r+1 x,. +1 = 0 

(see Sec. 7.4). Multiplying both sides by A and using AXj = we obtain 

(3) Ci^i x i + * * • + c r+1 A r+1 x r+ j = 0. 
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EXAMPLE 1 


THEOREM 2 


EXAMPLE 2 


DEFINITION 


To get rid of the last term, we subtract \ r+x times (2) from this, obtaining 

^r+l)^l “h * * * C r (Aj. A r+1 )X r 0* 

Here c x ( A x - A r+1 ) = 0, • • • , c r (A r - A,. +1 ) = 0 since {jc 1s • • • , *,.} is linearly independent. 
Hence c x = • • • = c r = 0, since all the eigenvalues are distinct. But with this, (2) reduces 
to c r+1 x r+1 = 0, hence c r+1 = 0, since x r+1 # 0 (an eigenvector!). This contradicts the fact 
that not all scalars in (2) are zero. Hence the conclusion of the theorem must hold. ■ 

Eigenbasis. Nondistinct Eigenvalues. Nonexistence 

[5 31 

The matrix A = has a basis of eigenvectors 

L3 5j 

Ai = 8, A 2 = 2. (See Example 1 in Sec. 8.2.) 

Even if not all n eigenvalues are different, a matrix A may still provide an eigenbasis for R n . See Example 
2 in Sec. 8.1, where n = 3. 

On the other hand, A may not have enough linearly independent eigenvectors to make up a basis. For 
instance. A in Example 3 of Sec. 8. 1 is 

r° n 

A = and has only one eigenvector 

Lo oj 

Actually, eigenbases exist under much more general conditions than those in Theorem 1. 
An important case is the following. 


~k 

o 


( k ¥= 0, arbitrary). 


1 1 

* 

1- -L 


corresponding to the eigenvalues 


Symmetric Matrices 

A symmetric matrix has an orthonormal basis of eigenvectors for R n . 


For a proof (which is involved) see Ref. [B3], vol. 1, pp. 270-272. 

Orthonormal Basis of Eigenvectors 

The first matrix in Example 1 is symmetric, and an orthonormal basis of eigenvectors is [l/V2 1/V2] T , 

[l/V2 -1A/2] T ■ 

Diagonalization of Matrices 

Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are 
the eigenvalues of A. This is done by a “similarity transformation,” which is defined as 
follows (and will have various applications in numerics in Chap. 20). 


Similar Matrices. Similarity Transformation 

An n X n matrix A is called similar to an n X n matrix A if 

(4) A = P _1 AP 

for some (nonsingular!) n X n matrix P. This transformation, which gives A from 
A, is called a similarity transformation. 
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THEOREM 3 


PROOF 


EXAMPLE 3 


THEOREM 4 


The key property of this transformation is that it preserves the eigenvalues of A: 


Eigenvalues and Eigenvectors of Similar Matrices 

If A is similar to A, then A has the same eigenvalues as A. 

Furthermore , if x is an eigenvector of A, then y = P -1 x is an eigenvector of A 
corresponding to the same eigenvalue. 


From Ax = Ax (A an eigenvalue, x # 0) we get P *Ax = AP x x. Now I = PP \ By 
this “identity trick" the previous equation gives 


P“ x Ax = P _1 AIx = P^APP^x = A(P _1 x) = AP -1 x. 

Hence A is an eigenvalue of A and P~*x a corresponding eigenvector. Indeed, P -1 x = 0 
would give x = lx = PP _1 x = P0 = 0, contradicting x ^ 0. ■ 


Eigenvalues and Vectors of Similar Matrices 

f6 -31 

Let A = I I an 

L4 -ij 


Then 


- 1 '■[! -] 

■[.:*:][: :3c h ;i 


Here P“ l was obtained from (4*) in Sec. 7.8 with detP = 1. We see that A has the eigenvalues \ x = 3, 
A 2 = 2. The characteristic equation of A is (6 - A)(— I — A) + 12 = A 2 — 5 A + 6 — 0. It has the roots (the 
eigenvalues of A) \ x = 3, A 2 - 2, confirming the first part of Theorem 3. 

We confirm the second part. From the first component of (A — AI)x = 0 we have (6 — A).vx — 3 a * 2 = 0. 
For A = 3 this gives 3*i - 3 x 2 = 0, say, x x = [1 1 ] T . For A = 2 it gives 4.vx - 3 a * 2 = 0, say, x 2 = [3 4] T . 

Tn Theorem 3 wc thus have 


yi = P = 



y 2 = P *x 2 = 



Indeed, these are eigenvectors of the diagonal matrix A. 

Perhaps we see that x x and x 2 are the columns of P. This suggests the general method of transforming a 
matrix A to diagonal form D by using P — X, the matrix with eigenvectors as columns: H 


Diagonalization of a Matrix 

If an n X n matrix A has a basis of eigenvectors, then 
(5) D = X^AX 

is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X 
is the matrix with these eigenvectors as column vectors. Also, 

(5*) D TO = X -1 A m X (m = 2, 3, • • •)• 
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PROOF 


EXAMPLE 4 


Let x ls • • • , x* constitute a basis of eigenvectors of A for R n . Let the corresponding 
eigenvalues of A be A 1# • • • , A^, respectively, so that Ax x = AjX^ • • • , Ax„. = A n x n . 
Then X = [x x • • • x n ] has rank n , by Theorem 3 in Sec. 7.4. Hence X” 1 exists by 
Theorem 1 in Sec. 7.8. We claim that 


(6) AX = A[x x • • • x n ] = [Ax x • • • Ax*] = [A x x x • • • A^xJ = XD 


where D is the diagonal matrix as in (5). The fourth equality in (6) follows by direct 
calculation. (Try it for n = 2 and then for general n.) The third equality uses Ax k = A fc x fc . 
The second equality results if we note that the first column of AX is A times the first 
column of X, and so on. For instance, when n = 2 and we write x x = [jc x1 jc 2 i] t , 
*2 = [*12 *22] T > we have 


AX = A[jc x x 2 ] = 



a l2 

a 22j 


X 11 

l x 21 


*12 

* 22 . 




fl ll*ll + a 12 x 21 


a 21 x ll *+* a 22 x 21 


a ll*12 + ^ 12*22 ”1 

= [Ax 1 Ax 2 J. 

a 21 X 12 "b a 22 x 22_ 


Column l 


Column 2 


If we multiply (6) by X” 1 from the left, we obtain (5). Since (5) is a similarity 
transformation, Theorem 3 implies that D has the same eigenvalues as A. Equation (5*) 
follows if we note that 


D 2 = DD = X“ 1 AXX“ 1 AX = X“ x AAX = X“ x A 2 X, etc. ■ 


Diagonalization 

Diagonalize 



” 7.3 

0.2 

—3.7” 

A = 

— 1 1.5 

1.0 

5.5 


. 17.7 

1.8 

—9.3. 


Solution . The characteristic determinant gives the characteristic equation - A 3 - A 2 + 12A = 0. The roots 
(eigenvalues of A) are A x = 3, A 2 = -4, A 3 = 0. By the Gauss elimination applied to (A - AI)x = 0 with 
A = Aj, A 2 , A 3 we find eigenvectors and then X” 1 by the Gauss-Jordan elimination (Sec. 7.8. Example 1). The 
results are 


”-r 


” r 


" 2“ 


”-l 

1 

2“ 


”—0.7 

0.2 

0.3” 

3 


-i 

* 

1 

, X = 

3 

-1 

1 

. X -1 = 

-1.3 

-0.2 

0.7 

1_ 


- 3. 


.4. 


.-1 

3 

4. 


. 0.8 

0.2 

-0.2. 


Calculating AX and multiplying by X 1 from the left, we thus obtain 


D = X“ l AX = 


”—0.7 0.2 0.3” 


0 

1 
1 


U) 

o 

o 

-1.3 -0.2 0.7 


9 4 0 

= 

0 

1 

o 

. 0.8 0.2 -0.2. 


.-3 -12 0. 


o 

o 

o 
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EXAMPLE 5 


Quadratic Forms. Transformation to Principal Axes 

By definition, a quadratic form Q in the components x l5 • • • , x n of a vector x is a sum 
of n z terms, namely, 


n n 

Q — X Ax — * tykXjXk 

j=lfc=l 


= a u ^! 2 + Cl 12X1X2 + - ■ ■ + “lnXiXn 
+ a 2 iX 2 Xi + CI22X2 2 + ‘ * * + ci 2n x 2 x n 

4- 

d" ^n2-*n-*2 + • • • + ^nn-^n • 


A = is called the coefficient matrix of the form. We may assume that A is symmetric, 
because we can take off-diagonal terms together in pairs and write the result as a sum of 
two equal terms; see the following example. 


Quadratic Form. Symmetric Coefficient Matrix 

Let 

x t Ax = [xi x 2 ]\ I \ \ = 3 x ± 2 4 4a 1 .v 2 4* 6.v 2 a’i 4 2jc 2 2 = 3a'i 2 4 IOa^ 4 2a 2 2 . 

16 2] L^J 

Here 446 = 10 = 54 5, From the corresponding symmetric matrix C = where cj k = 4 a k j), 

thus c n = 3, c 12 = c 21 = 5, c 22 = 2, we get the same result; indeed. 


x T Cx = [x L 




= 3,V! 2 4 SxjAfc 4 5 a 2 a 1 4 2a* 2 2 = 3a x 2 4 10 a 1 a: 2 4 2a* 2 2 H 


Quadratic forms occur in physics and geometry, for instance, in connection with conic 
sections (ellipses x 2 la 2 + x 2 tb 2 = 1, etc.) and quadratic surfaces (cones, etc.). Their 
transformation to principal axes is an important practical task related to the diagonalization 
of matrices, as follows. 

By Theorem 2 the symmetric coefficient matrix A of (7) has an orthonormal basis of 
eigenvectors. Hence if we take these as column vectors, we obtain a matrix X that is 
orthogonal, so that X" 1 = X T . From (5) we thus have A = XDX“ x = XDX T . Substitution 
into (7) gives 


( 8 ) 


Q = x T XDX T x. 


If we set X T x = y, then, since X T = X 1 , we get 


(9) x = Xy. 

Furthermore, in (8) we have x T X = (X T x) T = y T and X T x = y, so that Q becomes simply 
(!°) Q = y T Dy = 4yi 2 + A 2 y 2 2 + • • • + A n y n 2 . 
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THEOREM 5 


EXAMPLE 6 


This proves the following basic theorem. 


Principal Axes Theorem 

The substitution (9) transforms a quadratic form 

n n 

Q = x t Ax = 22 OjkXjX k K> = a jk ) 

j=l k=l 

to the principal axes form or canonical form (10), where A l5 * * * , are the {not 
necessarily distinct) eigenvalues of the {symmetric!) matrix A, and X is an 
orthogonal matrix with corresponding eigenvectors x x , • • • , x^, respectively , as 
column vectors . 


Transformation to Principal Axes. Conic Sections 

Find out what type of conic section the following quadratic form represents and transform it to principal 
axes: 

Q = 17.Vj 2 - 30.rj.V2 + 17.v 2 2 = 128. 


Solution . We have Q = x T Ax, where 



This gives the characteristic equation (17 - A) 2 - 15 2 = 0. It has the roots Aj = 2. A 2 = 32. Hence (10) 
becomes 

Q = 2v, 2 + 32 v 2 2 . 


We see that Q- 128 represents the ellipse 2yj 2 + 32y 2 2 = 128, that is. 



= 1 . 


If we want to know the direction of the principal axes in the AjA^-coordinates, we have to determine normalized 
eigenvectors from (A - AI)x = 0 with A = Aj = 2 and A = A 2 = 32 and then use (9). We get 

r^i - r ,A J|. 

L1/V2J L 1/V2J 


hence 




'I/V 2 

-1/V2I IV 

■Vj = y,/V 2 - y 2 /V 2 

.I/V2 

1/V2J UJ ’ 

x 2 = y^V 2 + y 2 N 2 . 


This is a 45° rotation. Our results agree with those in Sec. 8.2. Example 1, except for the notations. See also 
Fig. 158 in that example. ■ 
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5-E- T 8 . 4 - 


[mT| diagonalization of matrices 

Find an eigenbasis (a basis of eigenvectors) and 
diagonalize. (Show the details.) 



’3 

2' 

■ 



“0 

16 1 


1. 

.2 

6 

- 


2. 

.4 

oJ 



‘5 

1' 




' 3 

2 1 


3. 

_1 

5. 



4. 

5 

-J 



"1.0 


6.01 



"2 

7 1 


5. 

.1.5 


J 


6. 

.6 - 

J 



"1 

0 

r 



’-6 

-6 

10" 

7. 

0 

3 

2 


8. 

-5 

-5 

5 


_0 

0 

2. 



_— 9 

-9 

13. 


3 


10 

-15“ 





9. 

-18 


39 

9 






24 


40 

-15. 






10. (Orthonormal basis) Illustrate Theorem 2 with further 
examples. 

11. (No basis) Find further 2X2 and 3X3 matrices 
without eigenbases. 

12. PROJECT. Similarity of Matrices. Similarity is 
basic, for instance in designing numeric methods. 

(a) Trace. By definition, the trace of an n X n matrix 
A = [aj k ] is the sum of the diagonal entries, 

trace A = fl n + % + ••• + ct nn . 

Show that the trace equals the sum of the eigenvalues, 
each counted as often as its algebraic multiplicity 
indicates. Illustrate this with the matrices in Probs. 1, 
3, 5, 7, 9. 

(b) Trace of product. Let B = [bj k ] be n X n. Show 
that similar matrices have equal traces, by first 
proving 


trace AB = 2) 2 a a^u = trace BA. 

i=l 1=1 

(c) Find a relationship between A in (4) and A = PAP*" 1 . 


(d) Diagonalization. What can you do in (5) if you 
want to change the order of the eigenvalues in D, for 
instance, interchange d u = A x and d 2 2 = A 2 ? 


1 13— 18| SIMILAR MATRICES HAVE EQUAL 
SPECTRA 

Verify this for A and A = P“ 1 AP. Find eigenvectors y of 
A. Show that x = Py are eigenvectors of A. (Show the 
details of your work.) 


13. A = 

14. A = 

15. A = 

16. A = 




2 

1 




2 " 

4. 



“ 4 

0 

0" 



‘4 

0 

6“ 

17. A = 

12 

-2 

0 

,p = 

0 

2 

0 


.21 

-6 

1 . 



_6 

0 

10_ 


"-5 

0 

15" 



"0 

1 

0" 

18. A = 

3 

4 

-9 

,p = 


1 

0 

0 


.-5 

0 

15. 



.0 

0 

1 . 


19-28 


TRANSFORMATION TO PRINCIPAL AXES. 
CONIC SECTIONS 


What kind of conic section (or pair of straight lines) is given 
by the quadratic form? Transform it to principal axes. 
Express x T = [x x x 2 ] in terms of the new coordinate vector 
y T = [y t y 2 ], as in Example 6. 

19. Jtj 2 + 24x^2 - 6x z 2 = 5 

20. 3x, 2 + 4V3x x x 2 + 7x 2 2 = 9 

21. 3JC, 2 — 8jcjAT 2 — 3jc 2 2 = 0 

22 . 6 *i 2 + l 6 .rj.V 2 — 6jc 2 2 = 20 

23. 4x x 2 + 2V3XXXZ + 2x 2 z = 10 

24. lx 2 - 74x x x 2 = 144 

25. .Vj 2 — 1 2 Xj x 2 + x 2 = 35 
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26. 3x x 2 + 22x 1 x 2 + 3 . v 2 2 = 0 

27. 12V + 32a-!A 2 + 12* 2 2 =112 

28. 6.5aj 2 + 5.0avc 2 4* 6.5x 2 2 = 36 

29. (Definiteness) A quadratic form Q{x) = x T Ax and its 
(symmetric!) matrix A are called (a) positive definite 
if Q{x) > 0 for all x ^ 0, (b) negative definite if 
Q(x) < 0 for all x ^ 0, (c) indefinite if (2W takes 
both positive and negative values. (See Fig. 1 60.) [Q(x) 
and A are called positive semidefmite {negative 
semidefinite) if Q{x) ^ 0 (Q(x) ^ 0) for all x.] A 
necessary and sufficient condition for positive 
definiteness is that all the “principal minors” are 
positive (see Ref. [B3], vol. 1, p. 306), that is, 


a n > 0 , 


flu 

a 12 


a 12 

a 22 


> 0 , 


Q(x) 



(a) Positive definite form 


Q(x ) 



a ll a l2 a 13 
a 12 a 22 a 23 

a 13 a 23 a 33 


> 0 , 


det A > 0. 


Show that the form in Prob. 23 is positive definite, 
whereas that in Prob. 19 is indefinite. 

30. (Definiteness) Show that necessary and sufficient for 
(a), (b), (c) in Prob. 29 is that the eigenvalues of A are 
(a) all positive, (b) all negative, (c) both positive and 
negative. Hint Use Theorem 5. 


( b ) Negative definite form 


Q(x) 



(c) Indefinite form 

Fig. 160. Quadratic forms in two variables 


8.! Complex Matrices and Forms. Optional 

The three classes of real matrices in Sec. 8.3 have complex counterparts that are of practical 
interest in certain applications, mainly because of their spectra (see Theorem 1 in this 
section), for instance, in quantum mechanics. To define these classes, we need the 
following standard 


Notations 

A = [d jk ] is obtained from A = [a jk ] by replacing each entry ci jk = a 4- i(3 
( a , (3 real) with its complex conjugate dj k = a — ip. Also, A T = [d k j] is the transpose 
of A, hence the conjugate transpose of A. 


EXAMPLE 1 Notations 


"3 + 4/ 

1 - / “ 

f3 - 4/ 

i +/ 1 

_ n- 4 i 

6 1 

If A = 


, then A = 

and A = 1 

. ■ 

_ 6 

2 - 5/J ! 

L 6 

2 + 5/J 

Li + i 

2 + 5/J 
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DEFINITION 


EXAMPLE 2 


Hermitian, Skew-Hermitian, and Unitary Matrices 


A square matrix A = [a^] is called 


Hermitian if A T = A, that is, 

a kj a jk 

skew-Hermitian if A T = —A, that is. 


unitary if A T = A^ 1 . 



The first two classes are named after Hermite (see footnote 13 in Problem Set 5.8). 

From the definitions we see the following. If A is Hermitian, the entries on the main 
diagonal must satisfy = a^\ that is, they are real. Similarly, if A is skew-Hermitian, 
then Ojj = —cijj. If we set — a + t/3, this becomes a — i/3 = —(a + i(3). Hence 
a = 0, so that cijj must be pure imaginary or 0. 

Hermitian, Skew-Hermitian, and Unitary Matrices 


" 4 

i - 3 r 

r 3/ 2 + n 

B = 

C = 

i 

<, 
i 

.1 + 3/ 

i _ 

2 + ; -/ 


jV3 \i J 


are Hermitian, skew-Hermitian, and unitary matrices, respectively, as you may verify by using the definitions. I 

If a Hermitian matrix is real, then A T = A T = A. Hence a real Hermitian matrix is a 
symmetric matrix (Sec. 8.3.). _ 

Similarly, if a skew-Hermitian matrix is real, then A = A = —A. Hence a real 
skew-Hermitian matrix is a skew-symmetric matrix. 

Finally, if a unitary matrix is real, then A T = A T = A"" 1 . Hence a real unitary matrix 
is an orthogonal matrix. 

This shows that Hermitian , skew-Hermitian, and unitary matrices generalize symmetric, 
skew-symmetric, and orthogonal matrices, respectively. 

Eigenvalues 

It is quite remarkable that the matrices under consideration have spectra (sets of eigenvalues; 
see Sec. 8.1) that can be characterized in a general way as follows (see Fig. 161). 



Fig. 161 . Location of the eigenvalues of Hermitian, 
skew-Hermitian, and unitary matrices in the complex A-plane 
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THEOREM 1 


EXAMPLE 3 


PROOF 


Eigenvalues 

(a) The eigenvalues of a Hermitian matrix (and thus of a symmetric matrix) are 
real. 

(b) The eigenvalues of a skew-Hermitian matrix (and thus of a skew-symmetric 
matrix) are pure imaginary or zero. 

(c) The eigenvalues of a unitary matrix (and thus of an orthogonal matrix) have 
absolute value 1 . 


Illustration of Theorem 1 

For the matrices in Example 2 we find by direct calculation 



Matrix 

Characteristic Equation 

Eigenvalues 

A 

Hermitian 

A 2 - 11A + 18 = 0 

9. 2 

B 

Skew-Hermitian 

A 2 - 2/A + 8 = 0 

4 /, —2/ 

C 

Unitary 

o 

II 

1 

1 

|V3 + \i, ~^V3 + \i 


and |±±V3 + i/f = |+4=l. ■ 

We prove Theorem 1. Let A be an eigenvalue and x an eigenvector of A. Multiply Ax = 
Ax from the left by x T , thus x T Ax = Ax T x, and divide by x T x = x x x x + •••-!- x n x n = 
\x x \ 2 + • ■ • + |x n | 2 , which is real and not 0 because x ^ 0. This gives 



(a) If A is Hermitian, A T = A or A T = A and we show that then the numerator in (1) is 
real, which makes A real. x T Ax is a scalar; hence taking the transpose has no effect. Thus 

(2) x T Ax = (x t Ax) t = x T A T x = x T Ax = (x T Ax). 

Hence, x T Ax equals its complex conjugate, so that it must be real, (a 4- ib = a — ib 
implies b = 0.) _ 

(b) If A is skew-Hermitian, A T = —A and instead of (2) we obtain 

(3) x t Ax = — (x T Ax) 

so that x t Ax equals minus its complex conjugate and is pure imaginary or 0. 
(a + ib = —(a — ib) implies a = 0.) 

(c) Let A be unitary. We take Ax = Ax and its conjugate transpose 

(Ax) t = (Ax) t = Ax t 
and multiply the two left sides and the two right sides, 

(Ax) t Ax = AAx t x = |A| 2 x t x. 
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THEOREM 2 


PROOF 


DEFINITION 


THEOREM 3 


But A is unitary, A T = A*" 1 , so that on the left we obtain 

(Ax) t Ax = x T A T Ax = x t A -1 Ax = x T Ix = x T x. 

Together, x T x = |A| 2 x t x. We now divide by x T x 0) to get |A| 2 = 1. Hence |A| = 1. 
This proves Theorem 1 as well as Theorems 1 and 5 in Sec. 8.3. ■ 

Key properties of orthogonal matrices (invariance of the inner product, orthonormality of 
rows and columns; see Sec. 8.3) generalize to unitary matrices in a remarkable way. 

To see this, instead of R n we now use the complex vector space C n of all complex 
vectors with n complex numbers as components, and complex numbers as scalars. For 
such complex vectors the inner product is defined by (note the overbar for the complex 
conjugate) 

(4) a*b = a T b. 

The length or norm of such a complex vector is a real number defined by 

(5) ||a|| = Va*a = Va" T a = y/a x a x + • • • + d^a^ = VlaJ 2 + • • • + |« n | 2 . 


Invariance of Inner Product 

A unitary transformation, that is , y = Ax with a unitary matrix A, preserves the 
value of the inner product (4), hence also the norm (5). 


The proof is the same as that of Theorem 2 in Sec. 8.3, which the theorem generalizes. 
In the analog of (9), Sec. 8.3, we now have bars, 

u* v = u T v = (Aa) T Ab = a T A T Ab = a T Ib = a T b = a*b. ■ 

The complex analog of an orthonormal systems of real vectors (see Sec. 8.3) is defined 
as follows. 


Unitary System 

A unitary system is a set of complex vectors satisfying the relationships 


( 6 ) 


[0 if j*k 

V a fc = a / a fc = 

[l if j = k. 


Theorem 3 in Sec. 8.3 extends to complex as follows. 


Unitary Systems of Column and Row Vectors 

A complex square matrix is unitary if and only if its column vectors ( and also its 
row vectors) form a unitaiy system. 
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PROOF 


THEOREM 4 


PROOF 


EXAMPLE 4 


THEOREM 5 


EXAMPLE 5 


The proof is the same as that of Theorem 3 in Sec. 8.3, except for the bars required in 
A t = A -1 and in (4) and (6) of the present section. ■ 


Determinant of a Unitary Matrix 

Let A be a unitary matrix . Then its determinant has absolute value one , that is, 
|det A| = 1. 


Similarly as in Sec. 8.3 we obtain 

1 = det (AA -1 ) = det (AA T ) = det A det A T = det A det A 
= det A det A = |det A| 2 . 

Hence |det A| = 1 (where det A may now be complex). ■ 

Unitary Matrix Illustrating Theorems 1c and 2-4 

For the vectors a T = [2 — /) and b T = [1 + / 4/J we get a T = [2 /] T and a T b = 2(1 + 0 — 4 = -2 + 2/ 
and with 


"0.8/ 

0.6 ’ 



/ 


'-0.8 + 3.2 f 

A = 


also 

Aa = 


and 

Ab = 

.0.6 

0.8/. 



.2. 


2.6 4- 0.6 /_ 


as one can readily verify. This gives (Aa) T Ab = -2 + 2 i, illustrating Theorem 2. The matrix is unitary. Its 
columns form a unitary system, 

a, T a, = -0.8/ • 0.8/ + 0.6 2 = 1 . a/aa = -0.8/ • 0.6 + 0.6 • 0.8/ = 0. 
a2 T a 2 = 0.6 2 + (— 0.8/)0.8/ = I 

and so do its rows. Also, det A = —1. The eigenvalues arc 0.6 + 0.8/ and -0.6 + 0.8/, with eigenvectors 
[1 11 T and [I —Irrespectively. H 

Theorem 2 in Sec. 8.4 on the existence of an eigenbasis extends to complex matrices as 
follows. 


Basis of Eigenvectors 

A Hermitian , skew-Hermitian, or unitary matrix has a basis of eigenvectors for C 1 
that is a unitary system. 


For a proof see Ref. [B3], vol. 1, pp. 270-272 and p. 244 (Definition 2). 

Unitary Eigenbases 

The matrices A, B, C in Example 2 have the following unitary systems of eigenvectors, as you should verify. 


A: V35 [l 3/ 3]T (A = 9) ’ 

VT4 [ ' 

- 3/ — 2] t (A = 2) 

B: vW [1 " 2/ _5]T (A = ~ 2i) - 

v1o [5 

1 + 2/] t (A = 4/) 

C: ^[1 1] T (A = |(/+V3)), 

^2 11 

-1] T (A = |(/-V3)). 
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Hermitian and Skew-Hermitian Forms 

The concept of a quadratic form (Sec. 8.4) can be extended to complex. We call the 
numerator x T Ax in (1) a form in the components x l9 • • • ,x n of x* which may now be 
complex. This form is again a sum of n 2 terms 

n n 

X T AX = 2) 2 a jkXj x k 
j=l/c=l 


— 4* * * * "b 

(7) 

+ 021 * 2*1 4 + a 2n x 2 x n 

+ 

4* iX n X\ 4“ * • • 4* a nn x n x n . 

A is called its coefficient matrix. The form is called a Hermitian or skew-Hermitian 
form if A is Hermitian or skew-Hermitian, respectively. The value of a Hermitian form 
is real, and that of a skew-Hermitian form is pure imaginary or zero. This can be seen 
directly from (2) and (3) and accounts for the importance of these forms in physics. Note 
that (2) and (3) are valid for any vectors because in the proof of (2) and (3) we did not 
use that x is an eigenvector but only that x T x is real and not 0. 


EXAMPLE 6 Hermitian Form 

For A in Example 2 and, say, x = [1 + / 5/] T we get 


_ r 4 1-3/1 r I + n [”4(1 +/) + (! — 3 /) • 5/” 

x t Ax = [1 - / - 5 /J - [I - / - 5 /] 

.1 + 3 / 7 J L 5 ? J LO + 30(1 + 0 + 7 - 5 / 


223 . 


Clearly, if A and x in (4) are real, then (7) reduces to a quadratic form, as discussed in 
the last section. 




1. (Verification) Verify the statements in Examples 2 
and 3. 

2. (Product) Show (BA) 7 = -AB for A and B in 
Example 2. For any n X n Hermitian A and 
skew-Hermitian B. 

3. Show that (ABC) 7 = -C -1 BA for any n X n 
Hermitian A, skew-Hermilian B, and unitary C. 

4. (Eigenvectors) Find eigenvectors of A, B> C in 
Examples 2 and 3. 


1 

i 

V2 

vf 

i 

i 

V2 

V2 



0 0 " 
0 Si 


[5-11 1 EIGENVALUES AND EIGENVECTORS 

Are the matrices in Probs. 5-11 Hermitian? Skew- 


L 0 5i 0 J 


Hermitian? Unitary? Find their eigenvalues (thereby 
verifying Theorem 1) and eigenvectors. 

■ 0 

1 + / 

0 “ 

r 4 n 

5. 

T 0 2/1 

6. 

10. 

1 - / 

0 

1 + i 

_-i 2. 

In o. 


. 0 

1 - / 

0 . 


r 

o_ 
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11 . 


'0 
0 
_ i 


0 

/ 

0 


i 

0 

0. 


12. PROJECT. Complex Matrices 

(a) Decomposition. Show that any square matrix may 
be written as the sum of a Hermitian and a 
skew-Hermitian matrix. Give examples. 

(b) Normal matrix. This important concept denotes 
a matrix that commutes with its conjugate transpose, 

AA T = A T A. Prove that Hermitian, skew-Hermitian, 
and unitary matrices are normal. Give corresponding 
examples of your own. 

(c) Normality criterion. Prove that A is normal if and 
only if the Hermitian and skew-Hermitian matrices in 
(a) commute. 

(d) Find a simple matrix that is not normal. Find a 
normal matrix that is not Hermitian, skew-Hermitian, 
or unitary. 

(e) Unitary matrices. Prove that the product of two 
unitary n X n matrices and the inverse of a unitary 
matrix are unitary. Give examples. 

(f) Powers of unitary matrices in applications may 
sometimes be very simple. Show that C 12 = I in 
Example 2. Find further examples. 


13-15 


COMPLEX FORMS 


Is the given matrix (call it A) Hermitian or skew-Hermitian? 
Find x T Ax. (Show all the details.) a , b , c, k are real. 


r ° 

-3/“ 


f4 + il 


13. 


, x = 




L— 3/ 

0 _ 


l_3 - 

il 


r « 


b + ic~ 




14. 



, X = 

= 


u - 

ic 

k _ 



x 2 

r 2 

1 

+ n 


rn 

15. 


L 

X = 



Li- 

i 

i J 


L2/J 


16. (Pauli spin matrices) Find the eigenvalues and 
eigenvectors of the so-called Pauli spin matrices and show 
that S x S y = iS„ S y S x = -/s 2> S* 2 = S 2 = S 2 = I, 
where 



CHAPTER 8 REVIEW QUESTIONS AND PROBLEMS 


1. In solving an eigenvalue problem, what is given and 
what is sought? 

2. Do there exist square matrices without eigenvalues? 
Eigenvectors corresponding to more than one 
eigenvalue of a given matrix? 

3. What is the defect? Why is it important? Give examples. 

4. Can a complex matrix have real eigenvalues? Real 
eigenvectors? Give reasons. 

5. What is diagonalization of a matrix? Transformation of 
a form to principal axes? 

6. What is an eigenbasis? When does it exist? Why is it 
important? 

7. Does a 3 X 3 matrix always have a real eigenvalue? 

8. Give a few typical applications in which eigenvalue 
problems occur. 


9-14 


DIAGONALIZATION 


Find an eigenbasis and diagonalize. (Show the details.) 


9 . 




10 1 
-144 







Summary of Chapter 8 
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15-17 


SIMILARITY 


Verify that A and A = 
Here, A, P are: 


P *AP have the same spectrum. 



"3.8 

2.4"l 

P 

2 1 



15. 

.2.4 

0 . 2 J 

•L 

J 




"-22 

20 

10 “ 


"1 

0 

2 " 

16. 

-4 

20 

-8 

* 

0 

2 

4 


. 28 

-14 

29. 


_2 

8 

0. 



"2 

2 

- 2 “ 


“1 

2 

3 “ 

17 . 

3 

1 

-3 

* 

2 

0 

2 


.1 

-l 

- 1 . 


.3 

2 

4 . 


Transformation to Canonical Form. Reduce the quadratic 
form to principal axes. 

18. 11.56 a*] 2 + 20. \6.\\x 2 + I7.44a* 2 2 = 100 

19. 1.09a*] 2 - 0.06a]A* 2 + I.OlA'a 2 = 1 

20. 1 4 a *] 2 4 - 24a* 1 a* 2 — 4a* 2 2 = 20 


SUMMARY QF CHAPTER 8 

Linear Algebra: Matrix Eigenvalue Problems 


The practical importance of matrix eigenvalue problems can hardly be overrated. 
The problems are defined by the vector equation 

(1) Ax = Ax. 


A is a given square matrix. All matrices in this chapter are square . A is a scalar. To 
solve the problem (l) means to determine values of A, called eigenvalues (or 
characteristic values) of A, such that (1) has a nontrivial solution x (that is, 
x =£ 0), called an eigenvector of A corresponding to that A. An n X n matrix has 
at least one and at most n numerically different eigenvalues. These are the solutions 
of the characteristic equation (Sec. 8.1) 


(2) D{ A) = det (A - AI) = 


A a 12 
#21 #22 ^ 

#nl #w2 


#ln 
#2 n 


#nn ^ 


= 0. 


D( A) is called the characteristic determinant of A. By expanding it we get the 
characteristic polynomial of A, which is of degree n in A. Some typical applications 
are shown in Sec. 8.2. 

Section 8.3 is devoted to eigenvalue problems for symmetric (A T = A), 
skew-symmetric (A T = —A), and orthogonal matrices (A T = A -1 ). Section 8.4 
concerns the diagonalization of matrices and the transformation of quadratic forms 
to principal axes and its relation to eigenvalues. 

Section 8.5 extends Sec. 8.3 to the complex^ analogs of those real matrices, 
called Hermidan (A T = A), skew-Hermitian (A T = —A), and unitary matrices 
(A = A^ 1 ). All the eigenvalues of a Hermitian matrix (and a symmetric one) are 
real. For a skew-Hermitian (and a skew-symmetric) matrix they are pure imaginary 
or zero. For a unitary (and an orthogonal) matrix they have absolute value l . 






CHAPTER 9 

Vector Differential Calculus. 
Grad, Div, Curl 


This chapter deals with vectors and vector functions in 3-space, the space of three 
dimensions with the usual measurement of distance (given by the Pythagorean theorem). 
This includes 2-space (the plane) as a special case. It extends the differential calculus to 
those vector functions and the vector fields they represent. Forces, velocities, and various 
other quantities are vectors. This makes the algebra, geometry, and calculus of these vector 
functions the natural instrument for the engineer and physicist in solid mechanics, fluid 
flow, heat flow, electrostatics, and so on. The engineer must understand these vector 
functions and fields as the basis of the design and construction of systems, such as 
airplanes, laser generators, and robots. 

In Secs. 9. 1-9.3 we explain the basic algebraic operations with vectors in 3-space. 
Calculus begins in Sec. 9.4 with the extension of differentiation to vector functions in a 
simple and natural fashion. Application to curves and their use in mechanics follows in 
Sec. 9.5. 

We finally discuss three physically important concepts related to scalar and vector fields, 
namely, the gradient (Sec. 9.7), divergence (Sec. 9.8), and curl (Sec. 9.9). (The use of 
these concepts in integral theorems follows in the next chapter. Their form in curvilinear 
coordinates is given in App. A3.4.) 

We shall keep this chapter independent of Chaps. 7 and 8. Our present approach is in 
harmony with Chap. 7, with the restriction to two and three dimensions providing for a 
richer theory with basic physical, engineering, and geometric applications. 

Prerequisite: Elementary use of second- and third-order determinants in Sec. 9.3. 

Sections that may be omitted in a shorter course: 9.5, 9.6. 

References and Answers to Problems: App. 1 Part B, App. 2. 


9.1 Vectors in 2-Space and 3-Space 

In physics and geometry and its engineering applications we use two kinds of quantities: 
scalars and vectors. A scalar is a quantity that is determined by its magnitude; this is the 
number of units measured on a suitable scale. For instance, length, voltage, and temperature 
are scalars. 

A vector is a quantity that is determined by both its magnitude and its direction. Thus 
it is an arrow or directed line segment. For instance, a force is a vector, and so is a 
velocity, giving the speed and direction of motion (Fig. 162). 
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We denote vectors by lowercase boldface letters a, b, v, etc. In handwriting you may 
use arrows, for instance a (in place of a), b , etc. 

A vector (arrow) has a tail, called its initial point, and a tip, called its terminal point. 
This is motivated in the translation (displacement without rotation) of the triangle in Fig. 
163, where the initial point P of the vector a is the original position of a point, and the 
terminal point Q is the terminal position of that point, its position after the translation. 
The length of the arrow equals the distance between P and Q . This is called the length 
(or magnitude) of the vector a and is denoted by |a|. Another name for length is norm 
(or Euclidean norm). 

A vector of length 1 is called a unit vector. 


Velocity 



Fig. 162. Force and velocity 


Fig. 163. Translation 



Of course, we would like to calculate with vectors. For instance, we want to find the 
resultant of forces or compare parallel forces of different magnitude. This motivates our 
next ideas: to define components of a vector, and then the two basic algebraic operations 
of vector addition and scalar multiplication. 

For this we must first define equality of vectors in a way that is practical in connection 
with forces and other applications. 


DEFINITION 


Equality of Vectors 

Two vectors a and b are equal, written a = b, if they have the same length and the 
same direction [as explained in Fig. 164; in particular, note (B)]. Hence a vector 
can be arbitrarily translated; that is, its initial point can be chosen arbitrarily. 



Equal vectors, 
a = b 

(A) 


w 

Vectors having 
the same length 
but different 
direction 

(B) 



Vectors having 
the same direction 
but different 
length 

CC) 


Fig. 164. (A) Equal vectors. (B)-(D) Different vectors 


/ \ 
Vectors having 
different length 
and different 
direction 

CD) 
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EXAMPLE 1 


Components of a Vector 

We choose an xyz Cartesian coordinate system 1 in space (Fig. 165), that is, a usual 
rectangular coordinate system with the same scale of measurement on the three mutually 
perpendicular coordinate axes. Let a be a given vector with initial point P: (x X9 y l9 z\) and 
terminal point Q: (x 2 , y 2 , z 2 )- Then the three coordinate differences 

(1) = x 2 ~ x x , a 2 = y 2 ~ Ji, ci 2 = z 2 - z x 

are called the components of the vector a with respect to that coordinate system, and we 
write simply a = [a l9 a 2 , a z ]. See Fig. 166. 

The length |a| of a can now readily be expressed in terms of components because from 

(1) and the Pythagorean theorem we have 

(2) |a| = Va x 2 + a 2 2 + a 3 2 . 


Components and Length of a Vector 

The vector a with inituii point P: (4. 0. 2) and terminal point Q: (6. —1. 2) has the components 
= 6 — 4 = 2, a 2 = — 0 = — 1 , <i 3 = 2 — 2 = 0. 

Hence a = [2, - 1, 0J. (Can you sketch a, as in Fig. 166?) Equation (2) gives the length 

|a| = V2 2 + (-I) 2 + 0 2 = V5. 

if we choose (—1,5. 8) as the initial point of a, the corresponding terminai point is (1. 4, 8). 

If we choose the origin (0. 0. 0) as the initial point of a, the corresponding terminal point is (2, - 1, 0); its 
coordinates equal the components of a. This suggests that we can determine each point in space by a vector, 
called the position vector of the point, as follows. ■ 

A Cartesian coordinate system being given, the position vector r of a point A: („v, y, z) 
is the vector with the origin (0, 0, 0) as the initial point and A as the terminal point (see 
Fig. 167). Thus in components, r = [x, y, z]. This can be seen directly from (1) with 
*1 = Vl = 2l = 0. 



Fig. 165. Cartesian 
coordinate system 



Fig. 166. Components 
of a vector 



Fig. 167. Position vector r 
of a point A: (x, y, z ) 


1 Named after the French philosopher and mathematician RENATUS CARTESIUS. latinized for REN£ 

DESCARTES (1596-1650). who invented analytic geometry. His basic work Geom&rie appeared in 1637. as 
an appendix to his Discours de la ntethode. 
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THEOREM 1 


DEFINITION 



Fig. 168. Vector 
addition 


Furthermore, if we translate a vector a, with initial point P and terminal point Q, then 
corresponding coordinates of P and Q change by the same amount, so that the differences 
in (1) remain unchanged. This proves 


Vectors as Ordered Triples of Real Numbers 

A fixed Cartesian coordinate system being given, each vector is uniquely determined 
by its ordered triple of corresponding components. Conversely , to each ordered triple 
of real numbers (a l9 a 2 , a 3 ) there corresponds precisely one vector a = [a ly a 2 , a 3 ], 
with (0, 0, 0) corresponding to the zero vector 0, which has length 0 and no direction. 

Hence a vector equation a = b is equivalent to the three equations a 1 = b l9 
a 2 = b 2y a 3 = b 3 for the components. 


We now see that from our “geometric” definition of a vector as an arrow we have arrived 
at an “algebraic” characterization of a vector by Theorem 1. We could have started from 
the latter and reversed our process. This shows that the two approaches are equivalent. 

Vector Addition, Scalar Multiplication 

Applications suggest calculation with vectors that are practically useful and are almost as 
simple as the arithmetics for real numbers. The first is addition and the second is 
multiplication by a number. 


Addition of Vectors 

The sum a + b of two vectors a = [a x> a 2y a 3 ] and b = [b l9 b 2 , b 3 ] is obtained by 
adding the corresponding components, 

(3) a + b = [a r + b x , a 2 + b 2 , a 3 + b 3 ]. 

Geometrically, place the vectors as in Fig. 168 (the initial point of b at the terminal 
point of a); then a + b is the vector drawn from the initial point of a to the terminal 
point of b. 


For forces, this addition is the parallelogram law by which we obtain the resultant of two 
forces in mechanics. See Fig. 169. 

Figure 170 shows (for the plane) that the “algebraic” way and the “geometric way” of 
vector addition give the same vector. 




Fig. 169. Resultant of two forces (parallelogram law) 
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Basic Properties of Vector Addition. Familiar laws for real numbers give immediately 
(see also Figs. 171 and 172) 


( 4 ) 


(a) a + b = b + a ( Commutativity ) 

(b) (u + v) + w = u + (v + w) (Associativity) 

(c) a + 0 = 0 + a = a 

(d) a + (-a) = 0. 


Here -a denotes the vector having the length |a| and the direction opposite to that of a. 

In (4b) we may simply write u + v + w, and similarly for sums of more than three 
vectors. Instead of a 4- a we also write 2a, and so on. This (and the notation —a used 
just before) motivates defining the second algebraic operation for vectors as follows. 


y 




Fig. 170. Vector addition Fig. 171. Cummutativity Fig. 172. Associativity 

of vector addition of vector addition 


DEFINITION 

/ // / 
a 2a -a a 

Fig. 173. Scalar 
multiplication 
[multiplication of 
vectors by scalars 
(numbers)] 


Scalar Multiplication (Multiplication by a Number) 

The product ca of any vector a = [a x , a 2 , a 3 ] and any scalar c (real number c) is 
the vector obtained by multiplying each component of a by c, 

(5) ca = [ca x , ca 2 , ca 3 \ . 

Geometrically, if a =£ 0 , then ca with c > 0 has the direction of a and with c < 0 
the direction opposite to a. In any case, the length of ca is |ca| = |c||a|, and ca = 0 
if a = 0 or c = 0 (or both). (See Fig. 173.) 


Basic Properties of Scalar Multiplication. From the definitions we obtain directly 


(a) c(a + b) = ca + cb 

(b) (c + k) a = ca 4- ka 

(c) c(kst) = (ck) a 

(d) la = a. 


( 6 ) 


(written cka) 
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EXAMPLE 2 


EXAMPLE 3 


You may prove that (4) and (6) imply for any vector a 


(7) 


(a) Oa = 0 

(b) (-l)a = -a. 


Instead of b + (—a) we simply write b — a (Fig. 174). 

Vector Addition. Multiplication by Scalars 

With respect to a given coordinate system, let 

a = [4,0,1] and b = [2, 

Then -a = [-4. 0,-1], 7a - [28, 0, 7|, a + b * [6, -5. §], and 

2(a - b) = 2[2, 5, §] = [4, 10, §] = 2a - 2b. ■ 

Unit Vectors i, j, k. Besides a = [a lf a 2 , a 3 ] another popular way of writing vectors is 

(8) a = a x i 4- a 2 j + a 3 k 


In this representation, i, j, k are the unit vectors in the positive directions of the axes of 
a Cartesian coordinate system (Fig. 175). Hence, in components, 

(9) i = [l, 0, 0], j = [0, 1, 0], k = [0, 0, 1] 

and the right side of (8) is a sum of three vectors parallel to the three axes. 

i j k Notation for Vectors 

In Example 2 we have a = 4i + k, b = 2i - 5j + gk, and so on. ■ 

All the vectors a — [a ly a 2i tf 3 ] = + a 2 j + n 3 k (with real numbers as components) 

form the real vector space R 3 with the two algebraic operations of vector addition and 
scalar multiplication as just defined. R 3 has dimension 3. The triple of vectors i, j, k is 
called a standard basis of /? 3 . A Cartesian coordinate system being given, the 
representation (8) of a given vector is unique. 

Vector space i? 3 is a model of a general vector space, as discussed in Sec. 7.9, but is 
not needed in this chapter. 



Fig. 174. Difference 
of vectors 




Fig. 175. The unit vectors i, j, k 
and the representation (8) 
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MS] COMPONENTS AND LENGTH 

Find the components of the vector v with given initial point 
P and terminal point Q. Find |v|. Sketch |v|. Find the unit 
vector in the direction of v. 

1. P: (3, 2, 0), Q: (5, -2, 0) 

2. P: (I, 1, 1), Q: (—4, —4, —4) 

3. P: (1.0, 1.2), Q: (0, 0, 6.2) 

4. P: (2. —2, 0), Q: (0, 4, 6) 

5. P: (4, 3, 2), Q: (-4, -3, 2) 

6. P: (0, 0, 0), Q: (6, 8, 10) 

1 7—1 2 1 Given the components u lt v 2 , v 3 of a vector v 
and a particular initial point P, find the corresponding 
terminal point Q and the length of v. 

7. 3, -1,0; P: (4, 6, 0) 

8. 8, 4, -2; P: (-8, -4, 2) 

9. i 2, |; P: (0, §); 

10. 3, 2, 6; P: (0, 0, 0) 

11. 4,± -f; />:(- 4, ±2) 

12. 3, -3,3: P: (1. 3. —3) 

1 3-20 1 VECTOR ADDITION AND 
SCALAR MULTIPLICATION 

Let a = [2, -1, 0J = 2i - j, 

b = [-4, 2, 5] = -4i 4 2j 4 5k, c = [0, 0, 3] = 3k. 

Find: 

13. 2a, —a, — ^a 14. a 4 2b, 2b 4 a 

15. 5(a - c), 5a - 5c 

16. (3a - 5b) 4 2c, 3a 4* (-5b 4 2c) 

17. 6a — 4b 4 2c, 2(3a — 2b 4 c) 

18. (l/|a|)a, (l/|c|)c 

19. a 4 b 4 c, -3a - 3b - 3c 

20. |a 4 b|, |a| 4 |b| 

21. What laws do Probs. 14-17 illustrate? 

22. Prove (4) and (6). 

23. Find the midpoint of the segment PQ in Probs. 7 and 9. 

24-28 1 FORCES 

Find the resultant (in components) and its magnitude. 

24. p = [1, 2, 0], q = [0, 4, -1], u = [4, 0, -3]. 
v = [6, 2, 4] 

25. p = [2, 2, 2], q = [-4, -4, 0], u = [2, 2, 7] 

26. p = [-1, -3, —5], q = [6, 4, 2], u = [-5, -1, 3] 

27. p = [8, 2, —4], q = 3p, u = — 5p 

28. p = [3, 0, -2], q = [2, 5. 1]. u = 4q 

29. Find v so that v, p, q, u in Prob. 25 are in equilibrium. 

30. For what c is the resultant of [3. 1. 7], [4, 4, 5], and 


[3, 2, c] parallel to the Ay-plane? 

31. Find forces p, q, u in the direction of the coordinate 
axes such that p, q, u, v = [2, 3. 0]. \v = [7, — 1, 1 1] 
are in equilibrium. Are p, q, u uniquely determined? 

32. If |p| = 1 and |q| = 2, what can be said about the 
magnitude and direction of the resultant? Can you think 
of an application where this matters? 

33. Same question as in Prob. 32 if |p| = 3, |q| = 2, |u| = 1. 

34. (Relative velocity) If airplanes A and B are moving 
southwest with speed |v A | = 500 mph and northwest 
with speed |v B | = 400 mph, respectively, what is the 
relative velocity v = v B — v A of B with respect to A? 

35. (Relative velocity) Same question as in Prob. 34 for 
two ships moving northwest with speed |v A | = 20 knots 
and northeast with speed |v B | = 25 knots. 

36. (Reflection) If a ray of light is reflected once in each 
of two mutually perpendicular mirrors, what can you 
say about the reflected ray? 

37. (Rope) Find the magnitude of the force in each rope 
in the figure for any weight w and angle a. 

38. TEAM PROJECT. Geometric Applications. To 
increase your skill in dealing with vectors, use vectors 
to prove the following (see the figures). 

(a) The diagonals of a parallelogram bisect each other. 

(b) The line through the midpoints of adjacent sides 
of a parallelogram bisects one of the diagonals in the 
ratio 1 : 3. 

(c) Obtain (b) from (a). 

(d) The three medians of a triangle (the segments from 
a vertex to the midpoint of the opposite side) meet at 
a single point, which divides the medians in the ratio 
2 : 1 . 

(e) The quadrilateral whose vertices are the midpoints 
of the sides of an arbitrary quadrilateral is a 
parallelogram. 

(f) The four space diagonals of a parallelepiped meet 
and bisect each other. 

(g) The sum of the vectors drawn from the center of 
a regular polygon to its vertices is the zero vector. 



Problem 37 Team Project 38(a) 


Team Project 38(d) Team Project 38(e) 
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9.2 Inner Product (Dot Product) 

We shall now define a multiplication of two vectors that gives a scalar as the product and 
is suggested by various applications, in particular when angles between vectors and lengths 
of vectors are involved. 


DEFINITION 


Inner Product (Dot Product) of Vectors 

The inner product or dot product a*b (read “a dot b”) of two vectors a and b is 
the product of their lengths times the cosine of their angle (see Fig. 176), 


( 1 ) 


a*b = |a||b| cos y 
a*b = 0 


if a # 0, b i= 0 
if a = 0 or b = 0. 


The angle y, 0 ^ y ^ 7 r, between a and b is measured when the initial points of the 
vectors coincide, as in Fig. 176. In components, a = [a lf a 2 , a 3 ], b = [b ly b 2 , 
and 


( 2 ) 


a*b = ciibi + a 2 b 2 + a 3 £> 3 . 


The second line in (1) is needed because y is undefined when a = 0 or b = 0. The 
derivation of (2) from (1) is shown below. 



b b b 

a*b>0 a«b = 0 a*b<0 

Fig. 176. Angle between vectors and value of inner product 


Orthogonality. Since the cosine in (1) may be positive, 0, or negative, so may be the 
inner product (Fig. 176). The case that the inner product is zero is of particular practical 
interest and suggests the following concept. 

A vector a is called orthogonal to a vector b if a*b = 0. Then b is also orthogonal to 
a , and we call a and b orthogonal vectors. Clearly, this happens for nonzero vectors if 
and only if cos y = 0; thus y = tt/2 (90°). This proves the important 


Orthogonality 

The inner product of two nonzero vectors is 0 if and only if these vectors are 
perpendicular. 


THEOREM 1 
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EXAMPLE 1 


Length and Angle. Equation (1) with b = a gives a*a = |a| 2 . Hence 
(3) |a| = Va°a. 


From (3) and (1) we obtain for the angle y between two nonzero vectors 


(4) 


a*b _ a°b 

cosr “ nn ~ v^TvP b - 


Inner Product. Angle Between Vectors 

Find the inner product and the lengths of a = [1. 2, 0] and b = [3, -2. 1] as well as the angle between these 
vectors. 

Solution . a*b = 1 • 3 H- 2 • (-2) + 0-1 = - 1 , |a| = Va^a = V5, |b| = VbM> = VR and (4) gives 
the angle 

y = arccos -^r = arccos (-0.1 1952) = 1.69061 = 96.865°. ■ 

N|b| 

From the definition we see that the inner product has the following properties. For any 
vectors a, b, c and scalars # x , q 2 , 


(5) 


(a) 

(tfia + <? 2 b) * c = q-fi-c + q 2 b»c 

{Linearity) 

(b) 

a°b = b*a 

{Symmetry) 


a*a ^0 1 


(c) 

1 

a»a — 0 if and only if a = 0 J 

{Positive-definiteness). 


Hence dot multiplication is commutative [see (5b)] and is distributive with respect to 
vector addition; in fact, from (5a) with #1 = 1 and q 2 = 1 we have 

(5a*) (a + b)*c = a*c + b*c ( Distributivity ). 

Furthermore, from (1) and |cos y\ ^ 1 we see that 

(6) |a°b| g |a||b| (Cauchy-Schwarz inequality ). 

Using this and (3), you may prove (see Prob. 18) 

(7) |a + b| ^ |a| + |b| (Triangle inequality). 


Geometrically, (7) with < says that one side of a triangle must be shorter than the other 
two sides together; this motivates the name of (7). 

A simple direct calculation with inner products shows that 

(8) |a + b| 2 + |a — b| 2 = 2(|a| 2 + |b| 2 ) {Parallelogram equality ). 

Equations (6)-(8) play a basic role in so-called Hilbert spaces (abstract inner product 
spaces), which form the basis of quantum mechanics (see Ref. [GR7] listed in App. 1). 
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EXAMPLE 2 


EXAMPLE 3 


Derivation of (2) from (1). We write a = a x i + a 2 j + and b = Z^i + b 2 j + b 2 k, 
as in (8) of Sec. 9.1. If we substitute this into a*b and use (5a*), we first have a sum of 
3X3 = 9 products 


a*b = a x bi i*i 4- a^i'i + * • * + a 3 Z? 3 k*k. 


Now i, j, k are unit vectors, so that l*i = j • j = k*k = 1 by (3). Since the coordinate 
axes are perpendicular, so are i, j, k, and Theorem 1 implies that the other six of those 
nine products are 0, namely, i*j = j*i = j*k = k*j = k*i = i*k = 0. But this reduces 
our sum for a*b to (2). ■ 

Applications of Inner Products 

Typical applications of inner products are shown in the following examples and in Problem 
Set 9.2. 

Work Done by a Force Expressed as an Inner Product 

This is a major application. It concerns a body on which a constant force p acts. (For a variable force, 
see Sec. 10.1.) Let the body be given a displacement d. Then the work done by p in the displacement is defined as 


(9) 


W = |p||d| cos a = p»d, 


that is, magnitude |p| of the force times length |d| of the displacement times the cosine of the angle a between 
p and d (Fig. 177). If a < 90°. as in Fig. 177, then W > 0. If p and d are orthogonal, then the work is zero 
(why?). If a > 90°. then W < 0, which means that in the displacement one has to do work against the force. 
(Think of swimming across a river at some angle a against the current.) fl 



d 

Fig. 177. Work done by a force 



Component of a Force in a Given Direction 

What force in the rope in Fig. 178 will hold a car of 5000 lb in equilibrium if the ramp makes an angle of 25° 
with the horizontal? 

Solution . Introducing coordinates as shown, the weight is a = [0, —5000) because this force points 
downward, in the negative y-direction. We have to represent a as a sum (resultant) of two forces, a = c + p, 
where c is the force the car exerts on the ramp, which is of no interest to us, and p is parallel to the rope, of 
magnitude (see Fig. 1 78) 

|p| = |a| cos y = 5000 cos 65’ = 21 13 [lb| 

and direction of the uni! vector u opposite to die direction of the rope; here y = 90° - 25° = 65° is the angle 
between a and p. Now a vector in the direction of the rope is 


b = [- 1 . tan 25°] = [-1, 0.46631], thus |b| = 1.10338, 
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so that 


u = - t-t b = [0.90631, -0.42262]. 
M 

Since |u| = 1 and cos y > 0, we see that we can also write our result as 


, , , , a*b 5000*0.46631 

IpI = (M cos y)\n\ = a*u = - -jgp = — = 2113 [lbl * 


Answer: About 2100 lb. I 

Example 3 is typical of applications in which one uses the concept of the component or 
projection of a vector a in the direction of a vector b 0), defined by (see Fig. 179) 

(10) p = |a| cos y. 

Thus p is the length of the orthogonal projection of a on a straight line / parallel to b, 
taken with the plus sign if pb has the direction of b and with the minus sign if pb has the 
direction opposite to b; see Fig. 179. 



Fig. 179. Component of a vector a in the direction of a vector b 


Multiplying (10) by |b|/|b| = 1, we have a*b in the numerator and thus 
(11) p = (b * 0). 


If b is a unit vector, as it is often used for fixing a direction, then (11) simply gives 
(12) p = a-b (|b| = 1). 


Figure 180 shows the projection p of a in the direction of b (as in Fig. 179) and the 
projection q = |b| cos y of b in the direction of a. 


a 



Fig. 180. Projections p of a on b and q of b on a 
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EXAMPLE 4 


EXAMPLE 5 


EXAMPLE 6 


Orthonormal Basis 

By definition, an orthonormal basis for 3-space is a basis {a, b, c) consisting of orthogonal unit vectors. It has 
the great advantage that the determination of the coefficients in representations v = /j a + /2b + /3C of a given 
vector v is very simple. We claim that l l = a*v, / 2 = b*v, / 3 = c* v. Indeed, this follows simply by taking 
the inner products of the representation with a, b, c, respectively, and using the orthonormality of the basis, 
a»v = /ja*a + / 2 a«b + / 3 a # c = / 1? etc. 

For example, the unit vectors i, j, k in (8), Sec. 9.1, associated with a Cartesian coordinate system form an 
orthonormal basis, called the standard basis with respect to the given coordinate system. ■ 

Orthogonal Straight Lines in the Plane 

Find the straight line L\ through the point P: (1. 3) in the .vy-plane and perpendicular to the straight line 
L 2 : x - 2 y + 2 = 0; see Fig. 181. 

Solution. The idea is to write a general straight line dj.v + a 2 y = c as a*r = c with a = aji =£ 0 
and r = [.r, v], according to (2). Now the line through the origin and parallel to L x is a*r = 0. Hence, by 
Theorem 1 , the vector a is perpendicular to r. Hence it is perpendicular to L x * and also to Li because L% and 
Lj* are parallel, a is called a normal vector of L\ (and of L \ *). 

Now a normal vector of the given line x - 2y + 2 = 0 is b = [I, -2]. Thus L Y is perpendicular to L 2 if 
b*a = a 1 - 2a 2 = 0, for instance, if a = [2, 1]. Hence L x is given by lx + y = c. It passes through P: (1, 3) 
when 2*1 + 3 = c = 5. Answer: y = -2v + 5. Show that the point of intersection is (.v, y) = (1.6, 1.8). H 

Normal Vector to a Plane 

Find a unit vector perpendicular to the plane 4.v + 2y + 4z = -7. 

Solution. Using (2), we may write any plane in space as 

(13) a«r = tfj.v + a 2 y + a 3 z = c 

where a = a 2 > 03] =£ 0 and r = [a*, y, 2]. The unit vector in the direction of a is (Fig. 182) 


Dividing by |a|, we obtain from (13) 

(14) n*r = p 


M 


where 


P = 


M ‘ 


From (12) we see that p is the projection of r in the direction of n. This projection has the same constant value 
c/|a| for the position vector r of any point in the plane. Clearly this holds if and only if n is perpendicular to 
the plane, n is called a unit normal vector of the plane (the other being -n). 

Furthermore, from this and the definition of projection it follows that |/>| is the distance of the plane from the 
origin. Representation (14) is called Hesse’s 2 normal form of a plane. In our case, a = [4, 2, 4j, 
c = -7, |a| = 6, n = ga = [f. 3, §], and the plane has the distance 7/6 from the origin. ■ 




Fig. 182. Normal vector to a plane 


LUDWIG OTTO HESSE (181 1—1874), German mathematician who contributed to the theory of curves and 
surfaces. 
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PROBLEM S^T 9.2 


1-12 


INNER PRODUCT 


Let a = [2, I, 4], b = [-4, 0, 3], c = [3, -2, 1]. Find 


1. a*b, b*a 

3. |3a - 2b|, |2b - 3a| 
5. (a • b)c, a(b • c) 

7. (a — b)*c, a*c - b*c 
9. a*(b — c), a*(c — b) 
11. 6(a 4- b) • (a - b) 


2. |a|, |b|, |e| 

4. a*(b 4- c), a*b 4- a*c 
6. a*b 4- b*c + c*a 
8. 4a* 3c, 12a *c 
10. |b + c|, |b| + |c| 

12. |a* c|, |a||c| 


13. What laws do Probs. 1, 3, 4, 7, 8 illustrate? 

14. Does u*v = u*w with u ^ 0 imply that v = w? 

15. Prove the Cauchy-Schwarz inequality. 

16. Verify the Cauchy-Schwarz inequality, the triangle 
inequality, and the parallelogram equality for the above 
a and b. 

17. Prove the parallelogram equality. 

18. (Triangle inequality) Prove (7). Hint. Use (3) for 
|a 4- b| and (6) to prove the square of (7). then take 
roots. 


19-22 


WORK 


Find the work done by a force p acting on a body if the 
body is displaced from a point A t o a point B along the 
straight segment AB . Sketch p and AB. (Show the details 
of your work.) 


19. p = [8, -4, 11], A: (1, 2, 0), B: (3, 6. 0) 

20. p = [2, 7, —4], A* (3, 1, 0), B: (0, 2, 0) 

21. p = [5, -2, 1], A: (4, 0, 3), B: (6, 0, 8) 


22. p = [4, 3, 6], A: (5, 2, 10), B: (1, 3, 1) 


23. Why is the work in Prob. 19 zero? Can work be 
negative? Explain. 

24. Show that the work done by the resultant of p and q 
in a displacement from A to B is the sum of the work 
done by each force in that displacement. 

25. Find the work W=p*difd = 2i and p = i, i 4- j, 
j, — i 4- j and sketch a figure similar to Fig. 177. 


26-30 


ANGLE BETWEEN VECTORS. 
ORTHOGONALITY 


Let a = [1, 1, 1], b = [2, 3, 1]. c = [-1, 1, 0]. Find the 
angle between: 

26. a, b 27. b, c 28. a — c, b — c 
29. a 4- b, c 30. a, b 4- c 


31. (Planes) Find the angle between the planes 
a* 4* y 4- z = 1 and 2x — y 4- 2z = 0. 


32. (Cosine law) Deduce the law of cosines by using 
vectors a, b, and a — b. 

33. (Triangle) Find the angles of the triangle with vertices 
[0, 0, 0], [1, 2, 3], [4, -1,3]. 

34. (Addition law) Obtain 

cos (<* — / 3) = cos a cos /3 4- sin a sin ft 
by using a = [cos or, sin a], b = [cos /3, sin /3], where 
0 = a = /3 = 27 t. 

35. (Parallelogram) Find the angles if the sides are [5, 0] 
and [1,2]. 

36. (Distance) Find the distance of the plane 
5x + 2y + z = 10 from the origin. 


37-40 


COMPONENTS IN THE DIRECTION 
OF A VECTOR 


Find the component of a in the direction of b. 

37. a = [1, I, 3], b = [0,0,5] 

38. a = [2, 0, 6], b = [3,4, -1] 

39. a = [0, 4, -3], b = [0, 4, 3] 

40. a = [— 1, 2, 0], b = [l, -2,0] 


41. Under what condition will the projection of a in the 
direction of b equal the projection of b in the direction 
of a? 

42. TEAM PROJECT. Orthogonality is particularly 
important, mainly because of the use of orthogonal 
coordinates, such as Cartesian coordinates, whose 
“natural basis” (9), Sec. 9.1, consists of three 
orthogonal unit vectors. 

(a) Show that a = [2, -2, 4], b = [0, 8, 4], 
c = [—20, —4, 8] are orthogonal. 

(b) For what values of a x are a = [a l9 2, 0] and 
b = [3, 4, -1] orthogonal? 

(c) Show that the straight lines 4.v 4- 2y = 1 and 
5x — lOv = 7 are orthogonal. 

(d) Find all unit vectors a = [r^, a 2 J in the plane 
orthogonal to [4, 3]. 

(e) Find all vectors orthogonal to a = [2, 1, 0]. Do 
they form a vector space? 

(f) For what c are the planes 4.v — 2y 4- 3z = 6 and 
2x — cy 4- 5z = 1 orthogonal? 

(g) Under what condition will the diagonals of a 
parallelogram be orthogonal? (Prove your answer.) 

(h) What is the angle between a light ray and its 
reflection in three orthogonal plane mirrors (known as 
a “corner reflector”)? 

(i) Discuss further applications in physics and 
geometry in which orthogonality plays a role. 
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9.2 Vector Product (Cross Product) 

The dot product in Sec. 9.2 is a scalar. We shall see that in some applications, for instance, 
in connection with rotations, we shall need a product that is again a vector : 


DEFINITION 


Vector Product (Cross Product, Outer Product) of Vectors 

The vector product (also called cross product or outer product) a x b (read “a 
cross b”) of two vectors a and b is the vector 

v = a x b 

as follows. If a and b have the same or opposite direction, or if a = 0 or b = 0, 
then v = a x b = 0. In any other case v = a x b has the length 

(1) M = |a x b| = |a||b| sin y. 

This is the area of the blue parallelogram in Fig. 183. y is the angle between a and 
b (as in Sec. 9.2). The direction of v = a x b is perpendicular to both a and b and 
such that a, b, v, in this order, form a right-handed triple as in Figs. 183-185 
(explanation below). 


In components, let a = [a x , a 2 , 03 ] and b = [b x , b 2 , £ 3 ]. Then v = [u lf v 2> v 3 ] = a x b 
has the components 

(2) v x = a 2 b 3 - a 3 b 2 , v 2 = a 3 b x - a x b 3i v 3 = a x b 2 - a 2 b x . 

Here the Cartesian coordinate system is right-handed , as explained below (see also 
Fig. 186). (For a left-handed system, each component of v must be multiplied by — 1. 
Derivation of (2) in App. 4.) 

Right-Handed Triple. A triple of vectors a, b, v is right-handed if the vectors in the 
given order assume the same sort of orientation as the thumb, index finger, and middle 
finger of the right hand when these are held as in Fig. 1 84. We may also say that if a is 
rotated into the direction of b through the angle y (< 7 r), then v advances in the same 
direction as a right-handed screw would if turned in the same way (Fig. 185). 



Fig. 183. Vector product Fig. 184. Right-handed Fig. 185. Right-handed 

triple of vectors a, b, v screw 
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EXAMPLE 




Fig. 186. The two types of Cartesian coordinate systems 


Right-Handed Cartesian Coordinate System. The system is called right-handed if 
the corresponding unit vectors i, j, k in the positive directions of the axes (see Sec. 9.1) 
form a right-handed triple as in Fig. 186a. The system is called left-handed if the sense 
of k is reversed, as in Fig. 186b. In applications, we prefer right-handed systems. 


How to Memorize (2). If you know second- and third-order determinants, you see that 
( 2 ) can be written 



a 2 

a 3 


a i 

*3 


a 3 

a x 



a 2 

( 2 *) l>! = 

b 2 

b 3 

, v 2 = ~ 

b 1 

b 3 

= + 

*3 

b 1 

, »3 = 

b 1 

*2 


and v = [i>!, u 2 , *> 3 ] = v x i -I- 4- u 3 k is the expansion of the following symbolic 

determinant by its first row. (We call the determinant “symbolic” because the first row 
consists of vectors rather than of numbers.) 


i j k 







CI 2 

a 3 

i — 


% 

j + 

a 1 

0-2 

v = a x b = 

a x 

«2 

as 


*2 

b 3 

b 1 

b 3 

b 1 

b 2 




1*1 b 2 b 3 \ 

For a left-handed system the determinant has a minus sign in front. 

1 Vector Product 

For the vector product v = a x b of a = [1, l t OJ and b = [3, 0, 0] in right-handed coordinates we obtain 
from (2) 

Ui = 0, v 2 = 0, u 3 =• 1 • 0 — 1 ■ 3 = -3. 


We confirm this by (2**): 


v = a x b = 


* J 
1 1 
3 0 


k 

0 

0 


1 0 


1 0 


l 1 


i — 


j + 


0 0 


3 0 


3 0 


To check the result in this simple case, sketch a, b, and v. Can you see that two vectors in the Ay-plane must 
always have their vector product parallel to the z-axis (or equal to the zero vector)? ■ 
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EXAMPLE 2 Vector Products of the Standard Basis Vectors 


(3) 


i x j = k, 
j x i = -k, 


We shall use this in the next proof. 


j x k = i, 
k x j = -i, 


k x i = j 
i x k = -j. 


THEOREM 1 



Fig. 187. 

Anticommutativity 
of cross 
multiplication 


General Properties of Vector Products 

(a) For every scalar l, 

(4) (/a) x b = /(a x b) = a x (Zb). 

(b) Cross multiplication is distributive with respect to vector addition; that is, 

(a) a x (b + c) = (a x b) 4- (a x c), 

(5) 

(/3) (a + b) x c = (a x c) + (b x c). 

(c) Cross multiplication is not commutative but anticommutative; that is, 

(6) b x a = -(a x b) (Fig. 187). 

(d) Cross multiplication is not associative; that is, in general , 

(7) a x (b x c) * (a x b) x c 
so that the parentheses cannot be omitted. 


PROOF (4) follows directly from the definition. In (5ck), formula (2*) gives for the first component 
on the left 


*2 

b 2 + ^2 


«3 


^3 + c 3 


= ci 2 (b 3 4- c 3 ) - a z (b 2 + c 2 ) 


— (^ 2^3 a z b 2 ) 4- {ct 2 c z ci z c 2 ) 



«2 

a s 


a 2 

*3 

= 



4* 




b 2 

h 


c 2 

^3 


By (2*) the sum of the two determinants is the first component of (a x b) 4- (a x c), the 
right side of (5a). For the other components in (5a) and in (5)8), equality follows by the 
same idea. 

Anticommutativity (6) follows from (2**) by noting that the interchange of Rows 2 
and 3 multiplies the determinant by — 1 . We can confirm this geometrically if we set 
a x b = v and b x a = w; then |v| = |w| by (I), and for b, a, w to form a right-handed 
triple, we must have w = — v. 

Finally, i x (i x j) = i x k = -j, whereas (i x i) x j = 0 x j = 0 (see Example 
2). This proves (7). ■ 
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EXAMPLE 3 


EXAMPLE 4 


Typical Applications of Vector Products 

Moment of a Force 

In mechanics the moment m of a force p about a point Q is defined as the product m = |p| d % where d is the 
(perpendicular) distance between Q and the line of action L of p (Fig. 188). If r is the vector from Q to any 
point A on L % then d = |r| sin y (Fig. 188) and 


m = |r| |p| sin y. 

Since y is the angle between r and p, we see from ( 1 ) that m = |r x p|. The vector 
(8) m = rxp 

is called the moment vector or vector moment of p about Q. Its magnitude is m. If m ¥= 0, its direction is 
that of the axis of the rotation about Q that p has the tendency to produce. This axis is perpendicular to both 
r and p. M 



Moment of a Force 

Find the moment of the force p in Fig. 189 about the center Q of the wheel. 
Solution. Introducing coordinates as shown in Fig. 189, we have 


p = [1000 cos 30°, 1000 sin 30°. 0] = [866, 500, 0], r = [0. 1.5, 0). 


(Note that the center of the wheel is at y = — 1 .5 on the y-axis.) Hence (8) and (2**) give 


m = r x p = 


i 

j 

k 








0 

1.5 

0 

1.5 

0 

= 0i - Oj + 






866 

500 

866 

500 

0 





[0, 0, -1299]. 


This moment vector is normal (perpendicular) to the plane of the wheel; hence it has the direction of the axis 
of rotation about the center of the wheel that the force has the tendency to produce, m points in the negative 
z-direction, the direction in which a right-handed screw would advance if turned in that way. H 



Fig. 189. Moment of a force p 
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EXAMPLE 5 


Velocity of a Rotating Body 

A rotation of a rigid body B in space can be simply and uniquely described by a vector w as follows. The 
direction of w is that of the axis of rotation and such that the rotation appears clockwise if one Looks from the 
initial point of w to its terminal point. The length of w is equal to the angular speed c o (> 0) of the rotation, 
that is, the linear (or tangential) speed of a point of B divided by its distance from the axis of rotation. 

Let P be any point of B and d its distance from the axis. Then P has the speed tod. Let r be the position 
vector of P referred to a coordinate system with origin 0 on the axis of rotation. Then d — |r[ sin y, where y is 
the angle between w and r. Therefore, 

aid = |w| |r| sin y = |w x r|. 

From this and the definition of vector product we see that the velocity vector v of P can be represented in the 
form (Fig. 190) 

(9) v = w x r. 

This simple formula is useful for determining v at any point of B , I 




Fig. 190. Rotation of a rigid body 


Scalar Triple Product 

The most important product of vectors with more than two factors is the scalar triple 
product or mixed triple product of three vectors a, b, c. It is denoted by (a b c) and 
defined by 

(10*) (a b c) = a*(b x c). 

Because of the dot product it is a scalar. In terms of components a = [a 1% a 2 , a 3 ], 
b = [ b x , b 2> b 3 ], c = [c x , c 2 , c 3 ] we can write it as a third-order determinant. For this we 
set b x c = v = [u lf u 2 > v s]‘ Then fr° m the dot product in components [formula (2) in 
Sec. 9.2] and from (2*) with b and c instead of a and b we first obtain 

a # (b x c) = a*v = aiu v + o 2 v 2 4* a 3 v 3 



b 2 b 3 


b 3 Z?x 


b 1 b 2 

= a x 


+ a 2 


+ a 3 



c 2 C 3 


C3 Cl 


C 1 c 2 


The sura on the right is the expansion of a third-order determinant by its first row. Thus 


( 10 ) 


(a b c) = a*(b x c) = 


cii ci 2 a 3 

b\ b 2 b 3 

C 1 c 2 c 3 1 
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THEOREM 2 


PROOF 


EXAMPLE 6 


The most important properties of the scalar triple product are as follows. 


Properties and Applications of Scalar Triple Products 

(a) In (10) the dot and cross can be interchanged : 

(11) (a b c) = a*(b x c) = (a x b)*c. 

(b) Geometric interpretation. The absolute value |(a b c)| of (10) is the 
volume of the parallelepiped (oblique box) with a, b, c as edge vectors (Fig. 191). 

(c) Linear independence. Three vectors in R 3 are linearly independent if and 
only if their scalar triple product is not zero . 


(a) Dot multiplication is commutative, so that by (10) 


(a x b)*c = c*(a x b) = 


c 2 c 3 
a l a 2 a 3 

bi b 2 b 3 


From tills we obtain the determinant in (10) by interchanging Rows 1 and 2 and in the 
result Rows 2 and 3. But this does not change the value of the determinant because each 
interchange produces a factor - I, and ( — 1)( — 1) = 1. This proves (11). 

(b) The volume of that box equals the height h = |a||cos y\ (Fig. 191) times the area 
of the base, which is the area |b X c| of the parallelogram with sides b and c. Hence the 
volume is 

|a||b x c| |cos y\ = |a*(b x c)| (Fig. 191) 

as given by the absolute value of (11). 

(c) Three nonzero vectors, if we let their initial points coincide, are linearly independent 
if and only if they do not lie in the same plane (or do not lie on the same straight line). 
This happens if and only if the triple product in (b) is not zero, so that the independence 
criterion follows. (The case that one of the vectors is the zero vector is trivial.) ■ 



Fig. 191. Geometric interpretation of a scalar triple product 


Tetrahedron 

A tetrahedron is determined by three edge vectors a, b, c, as indicated in Fig. 192. Find its volume when 
a = [2. 0, 3], b = [0, 4, 1], c = 15, 6. OJ. 

Solution . The volume V of the parallelepiped with these vectors as edge vectors is the absolute value of the 
scalar triple product 
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Hence V = 72. The minus sign indicates that if the coordinates are right-handed, the triple a, b, c is left-handed. 
Fig. 192. The volume of a tetrahedron is £ of that of the parallelepiped (can you prove it?), hence 12. 

Tetrahedron Can you sketch the tetrahedron, choosing the origin as the common initial point of the vectors? What are the 

coordinates of the four vertices? H 

This is the end of vector algebra (in space R 3 and in the plane). Vector calculus 
(differentiation) begins in the next section. 


3EROBLEM SET — 9.3- .= 


[1-20 1 VECTOR PRODUCT, SCALAR TRIPLE 
PRODUCT 

With respect to right-handed Cartesian coordinates, let 
a = [1, 2, 0], b = [3, -4, 0], c = [3. 5, 2], d = [6, 2, -3]. 
Showing details, find: 

1. a x b, b x a 2. a x c, |a x c|, a*c 

3. (a + b) x c, a x c + b x c 

4. (c + d) x d, c x d 

5. 2a x 3b, 3a x 2b, 6a x b 

6. bxc + cxb 

7. a*(b x c), (a x b)*c 

8. (a 4- b) x (b + a) 

9. (a x b)*(c x d), (b x a)*(d x c) 

10. (a x b) x c, a x (b x c) 

11. d x c, |d x c|, |c x d| 

12. (a + b) x (c + d) 

13. a x (b 4- c - d) 

14. (i j k), (i k j) 

15. (i 4- j j + k k 4* i) 

16. (b x c)*d, b*(c x d) 

17. (a b d), |(a b d)|, (b a d) 

18. (a 4* b b 4- c c + d) 

19. (a - c b - c c), (a b c) 

20. (4a 3b 2c), 24 (b c a) 

21. What properties of cross multiplication do Probs. 1, 3, 

8. 10 illustrate? 

22. Give the details of the proofs of (4) and (5). 

23. Give the details of the proofs of (6) and (11). 

24. TEAM PROJECT. Useful Formulas for Two and 
More Vectors. Prove (12)— (16), which are often useful 
in practical work, and illustrate each formula with two 
examples. Hints . For (13) choose Cartesian coordinates 
such that d = [d^ 0, 0] and c = [c lt c 2 , 0], Show that 


each side of (13) then equals [— & 2 c 2 ^i> 0], and 

give reasons why the two sides are then equal in any 
Cartesian coordinate system. For ( 14) and (15) use ( 1 3). 
Formula (15) is called Lagrange’s identity. 

(12) |a x b| = V(a*a)(b*b) - (a*b) 2 

(13) b x (c x d) = (b*d)c - (b»c)d 

(14) (a x b) x (c x d) 

= (a b d)c - (a b c)d 

(15) (a x b)»(c x d) = (a*c)(b*d) - (a*d)(b # c) 

(a b c) = (b c a) = (c a b) 

(16) 

= -(c b a) = -(a c b) 

25-28 1 MOMENT OF A FORCE 

Find the moment vector m and the moment m of a force p 
about a point Q when p acts on a line through A. 

25. p = [4, 4. 0), Q: (2, 1. 0), A: (0, 3, 0) 

26. p = [0, 0, 5], Q: (3, 3, 0), A: (0, 0, 0) 

27. p = [1, 2, 3], Q: (0, 1, 1), A: (I, 0, 3) 

28. p = [4, 12, 8], Q: (3, 0, 5), A: (4, 3, 7) 

29. (Rotation) A wheel is rotating about the y-axis with 
angular speed a> = 10 sec -1 . The rotation appears 
clockwise if one looks from the origin in the positive 
v-direction. Find the velocity and speed at the point 
(4, 3, 0). 

30. (Rotation) What are the velocity and speed in Prob. 
29 at the point (4, 2, -2) if the wheel rotates about the 
line y = a*, z — 0 with co = 5 sec -1 . 

GEOMETRIC APPLICATIONS 

31. (Parallelogram) Find the area if the vertices are (2. 2), 
(9, 2), (10, 3), (3, 3). 
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32. (Parallelogram) Find the area if the vertices are (3, 9, 8), 
(0,5, 1), (-1, -3, -3), (2, 1,4). 

33. (Triangle) Find the area if the vertices are (1, 0, 0), 

( 0 , 1 , 0 ), ( 0 , 0 , 1 ). 

34. (Triangle) Find the area if the vertices are (4, 6, 5), 
(4, 9, 5), (8, 6, 7). 

35. (Plane) Find a normal vector and a representation of the 
plane through the points (4, 8, 0), (0, 2, 6), (3, 0, 5). 

36. (Plane) Find the plane through (2, 1, 3), (4, 4, 5), 

( 1 , 6 , 0 ). 


37. (Parallelepiped) Find the volume of the parallelepiped 
determined by the vertices (1, l, 1), (4, 7, 2). (3, 2, 1), 
(5, 4, 3). 

38. (Tetrahedron) Find the volume of the tetrahedron with 
vertices (0, 2, 1), (4, 3, 0), (6, 6, 5), (4, 7, 8). 

39. (Linear dependence) For what c are the vectors [9, 1 , 2], 
[— 1, c, 5], [4, c, 5] linearly dependent? 

40. WRITING PROJECT. Applications of Cross 
Products. Summarize the most important applications 
we have discussed in this section and give a few simple 
examples. No proofs. 


9.4 Vector and Scalar Functions and Fields. 
Derivatives 


We now begin with vector calculus. This calculus concerns two kinds of functions, namely, 
vector functions, whose values are vectors 

V = \{P) = [y x (P), V 2 (P), V 3 (P)] 

depending on the points P in space, and scalar functions, whose values are scalars 

/ = f(P) 

depending on P. Here, P is a point in the domain of definition, which in applications is 
a (three-dimensional) domain or a surface or a curve in space. We say that a vector function 
defines a vector field, and a scalar function defines a scalar field in that domain or on 
that surface or curve. Examples of vector functions are shown in Figs. 193-196. Examples 
of scalar fields are the temperature field in a body or the pressure field of the air in the 
earth’s atmosphere. Vector and scalar functions may also depend on time t or on some 
other parameters. 

Notation. If we introduce Cartesian coordinates x , y, z, then instead of \(P) we can also 
write 

v(a-, y, z) = [y 1 (.v, y, z), v 2 (x, y, z), v 3 (x, y, z)], 



Fig. 193. Field of tangent 
vectors of a curve 



Fig. 194. Field of normal 
vectors of a surface 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


but we keep in mind that components depend on the choice of a coordinate system, whereas 
a vector field that has a physical or a geometric meaning should have magnitude and 
direction depending only on P, not on that choice. Similarly for the value of a scalar field 
/CP) = f(x ; y, z). 

Scalar Function (Euclidean Distance in Space) 

The distance f(P ) of any point P from a fixed point P 0 in space is a scalar function whose domain of definition 
is the whole space. f{P) defines a scalar field in space. If we introduce a Cartesian coordinate system and P 0 
has the coordinates a 0 . yo, Zq, then / is given by the well-known formula 

rn = /(.v, y, Z) = V(A- - *o ) 2 + Vy - yof + (Z - zo ) 2 

where x. y. z are the coordinates of P. If we replace the given Cartesian coordinate system with another such 
system by translating and rotating the given system, then the values of the coordinates of P and P 0 will in general 
change, but f(P) will have the same value as before. Hence f(P) is a scalar function. The direction cosines of 
the straight line through P and P 0 are not scalars because their values depend on the choice of the coordinate 
system. H 


Vector Field (Velocity Field) 

At any instant the velocity vectors v(P) of a rotating body B constitute a vector field, called the velocity field 
of the rotation. If we introduce a Cartesian coordinate system having the origin on the axis of rotation, then (see 
Example 5 in Sec. 9.3) 


( 1 ) 


v(a, y, z) = W X r = w X [a\ y, z] = w X (.vi + yj + zk) 


where x , y, z are the coordinates of any point P of B at the instant under consideration. If the coordinates are 
such that the z-axis is the axis of rotation and w points in the positive z-direction, then w = a>k and 


v = 


i j k 

0 0 <o 

x y z 


= o>[-y, x, 0] = aj(-yi + aJ). 


An example of a rotating body and the corresponding velocity field are shown in Fig. 195. 



Fig. 195. Velocity field of a rotating body 

Vector Field (Field of Force, Gravitational Field) 

Let a particle A of mass M be fixed at a point P 0 and let a particle B of mass m be free to take up various 
positions P in space. Then A attracts B. According to Newton’s law of gravitation the corresponding gravitational 
force p is directed from P to P 0 , and its magnitude is proportional to 1 fr 2 , where r is the distance between 
P and Pq, say, 

IpI = 7 - • 


( 2 ) 


c = GMm . 
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Here G = 6.67 • 10~ 8 cm 3 /(gm • sec 2 ) is the gravitational constant. Hence p defines a vector field in space. If 
we introduce Cartesian coordinates such that P 0 has the coordinates .v 0 , yo> -o an ^ P has ^ ie coordinates a*, v, z, 
then by the Pythagorean theorem, 

r = V(.v - .v 0 ) 2 + (.v - ,v 0 ) 2 + (z - : 0 ) 2 » 0). 

Assuming that r > 0 and introducing the vector 


r = [a- - .v 0 , y - y 0 , z- Zo] = (x - A 0 )i + (.v - y 0 )j + (z - zo) k » 

we have |r| = r. and (— l/r)r is a unit vector in the direction of p; the minus sign indicates that p is directed 
from P to P 0 (Fig. 196). From this and (2) we obtain 


(3) 


, , ( 1 \ c r .V - A 0 y - ,v 0 

P = IPl r) = 3 r = c 3 f c 3 , 

2 “ ZO 1 

3 

\ f ) r L / r 

a- A' 0 . y ~ j’o . z - Zo . 

- c r 3 1 c ,.3 „3 k - 

/ r r 

r A 


This vector function describes the gravitational force acting on B. 


* s 

i 

Fig. 196. Gravitational field in Example 3 


Vector Calculus 

We show next that the basic concepts of calculus, namely, convergence, continuity, and 
differentiability, can be defined for vector functions in a simple and natural way. Most 
important here is the derivative. 

Convergence. An infinite sequence of vectors a< n) , n = 1, 2, • • • , is said to converge 
if there is a vector a such that 

(4) lim |a< n) - a| = 0. 

n-+z c 

a is called the limit vector of that sequence, and we write 

(5) lim a (n) = a. 

n-*-zc 

Cartesian coordinates being given, this sequence of vectors converges to a if and only 
if the three sequences of components of the vectors converge to the corresponding 
components of a. We leave the simple proof to the student. 



SEC 9.4 Vector and Scalar Functions and Fields. Derivatives 


387 


Similarly, a vector function v(r) of a real variable t is said to have the limit Z as t 
approaches t 0 , if v(r) is defined in some neighborhood of r 0 (possibly except at r 0 ) and 

(6) lim |v(t) - l\ = 0. 

t—*to 

Then we write 

(7) lim v(r) = l 

t — *to 

Here, a neighborhood of t 0 is an interval (segment) on the f-axis containing t 0 as an interior 
point (not as an endpoint). 

Continuity, A vector function \(t) is said to be continuous at / = t 0 if it is defined in 
some neighborhood of t 0 (including at t 0 itself!) and 

(8) lim v(t) = v(r 0 ). 

t — »Iq 

If we introduce a Cartesian coordinate system, we may write 

V(0 = KW. V 2 (t), t» 3 (0] = Ui(0i + «2(0j + u 3 (/)k. 

Then v(f) is continuous at r 0 if and only if its three components are continuous at t 0 . 

We now state the most important of these definitions. 


DEFINITION 


Derivative of a Vector Function 

A vector function v(r) is said to be differentiable at a point t if the following limit 
exists: 


( 9 ) 


v'(f) = lim 


At— >0 


\(t + AQ - v(Q 
A t 


This vector v\t) is called the derivative of v(r). See Fig. 197. 



v(t + At) 


Fig. 197. Derivative of a vector function 


In components with respect to a given Cartesian coordinate system, 
d°) v'(r) = t4(4 

Hence the derivative v'(f)is obtained by differentiating each component separately. For 
instance, if v = [;, t 2 , 0], then v' = [1, 2t, 0], 
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EXAMPLE 4 


EXAMPLE 5 


Equation (10) follows from (9) and conversely because (9) is a “vector form” of the 
usual formula of calculus by which the derivative of a function of a single variable is 
defined. [The curve in Fig. 197 is the locus of the terminal points representing v(t) for 
values of the independent variable in some interval containing t and t 4- AMn (9)]. It 
follows that the familiar differentiation rules continue to hold for differentiating vector 
functions, for instance, 

(cv)' = cv' ( c constant), 

(u 4- v)' = u' 4- v' 

and in particular 

(11) (u*v)' = u'*v + u*v' 

(12) (u x v)' = u' x v + u x v' 

(13) (u V w)' = (u' v w) + (u v' w) 4- (u V w'). 

The simple proofs are left to the student. In ( 1 2), note the order of the vectors carefully 
because cross multiplication is not commutative. 

Derivative of a Vector Function of Constant Length 

Let v(/) be a vector function whose length is constant, say, |v(r)| = c. Then |v| 2 = vv = c 2 , and 
(v*v)' = 2v*v' =0, by differentiation [see (11)]. This yields the following result. The derivative of a vector 
function v(/) of constant length is either the zero vector or is perpendicular to v(t). H 


Partial Derivatives of a Vector Function 

Our present discussion shows that partial differentiation of vector functions of two or more 
variables can be introduced as follows. Suppose that the components of a vector function 

v = [v l9 v 2 , v 3 \ = v x i 4- v 2 j + v 3 k 

are differentiable functions of n variables t l9 • • * , t n . Then the partial derivative of v 
with respect to t m is denoted by d\/dt m and is defined as the vector function 

dv dVi du 3 

— — l 4- j 4* k. 

dt-m dt m dt m dt m 

Similarly, second partial derivatives are 

d 2 y = (Pv 1 . + d 2 v 2 . + d 2 V 3 k 

dtidl m dt L dt m dtidt m ^ dtjdt m 

and so on. 

Partial Derivatives 


Let r(/ lf t 2 ) ~ a cos t x i + a sin ti j 4- r 2 k. Then 


dr 


dr 

-a sin tj i + a cos j and = k. M 

dt 2 


Various physical and geometric applications of derivatives of vector functions will be 
discussed in the next sections and in Chap. 10. 
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[Mi] SCALAR FIELDS 

Determine the isotherms (curves of constant temperature 
T) of the temperature fields in the plane given by the 
following scalar functions. Sketch some isotherms. 

1. T = Ay 2. T = 4a — 3y 

3. T = y 2 - a * 2 4. T = xl(x 2 + v 2 ) 

5. T = y/( a 2 + y 2 ) 6. T = x 2 - y 2 + 8y 

7. (Isobars) For the pressure field /(a, y) = 9a 2 + I6y 2 
find the isobars f(x, y) = const, the pressure at (4, 3), 
(-2, 2), (1, 5), and the region in which the pressure is 
between 4 and 16. 

8. CAS PROJECT. Scalar Fields in the Plane. Sketch 
or graph isotherms of the following fields and describe 
what they look like. 

(a) a 2 — 4a — y 2 (b) x 2 y - y 3 / 3 

(c) cos a sinh y (d) sin a sinh y 

(e) e x sin y (f) e 2x cos 2y 

(g) a 4 - 6A 2 y 2 + y 4 (h) a 2 - 2a - y 2 

1 9—14 | SCALAR FIELDS IN SPACE 

What kind of surfaces are the level surfaces /(a, y, z) = const? 

9. / = a 2 -r y 2 + 4z 2 10. / = a 2 + 4y 2 


11. / = Z - Va 2 + y 2 12. / = 4y 2 - z 

13. / = 4a + 3y - 5z 14. / = z — a 2 — 4y 2 

VECTOR FIELDS 

figures similar to Fig. 196. 

15. v = i — j 16. v = yi + A*j 

17. v = i + .v 2 j 18. v = A*i + yj 

19. v = yi — Aj 

20. v = (a - y)i + (a + y)j 

21-25 1 DIFFERENTIATION 

21. Prove (1 1)— (13). Give two examples for each formula. 

22. Find the first and second derivatives of 
[4 cos t , 4 sin t , 2 /]. 

23. Find the first partial derivatives of [4 a 2 , 9z 2 , Ayz] and 
[yz, za, Ay]. 

24. Find the first partial derivatives of 

[sin a cosh y, cos x sinh y] and [e x cos y, e x sin y]. 

25. WRITING PROJECT. Differentiation of Vector 
Functions. Summarize the essential ideas and facts and 
give examples of your own. 



Sketch 


9.E Curves. Arc Length. Curvature. Torsion 

A major application of vector calculus concerns curves (this section) and surfaces (Sec. 
10.5) and their use in physics and geometry. This field is called differential geometry. 
It plays a role in mechanics, computer-aided and traditional engineering design, geodesy 
and geography, space travel, and relativity theory (see Refs. [GR8], [GR9] in App. 1). 

Curves C in space may occur as paths of moving bodies. This and other applications 
motivate parametric representations with parameter f, which may be time or something 
else (see Fig. 1 98) 

(1) r (t) = [x(t), y(r), z(t)] = x(t)i + y(t)j + z(t)k. 
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Here a , y , z are Cartesian coordinates (the usual rectangular coordinates; see Sec. 9.1). 
To each value t = t 0 there corresponds a point of C with position vector r(f 0 X that is, 
with coordinates a(/ 0 ), v(/ 0 ), z(t 0 )> 

Parametric representations (1) have a key advantage over representations of a curve C 
in terms of its projections into the Ay-plane and into the Ac-plane, that is, 

(2) y = /«, z = g(x) 

(or by a pair of equations with y or with z as the independent variable). The advantage is 
that in (1) the coordinates a, y, z play the same role: all three are dependent variables. 
Moreover, the sense of increasing t , called the positive sense on C, induces an orientation 
of C, a direction of travel along C. The sense of decreasing t is then called die negative 
sense on C, given by (1). 

EXAMPLE 1 Circle 

The circle a 2 + y 2 = 4, z = 0 in the ^y-plane with center 0 and radius 2 can be represented parametrically by 

r(/) = [2 cos /, 2 sin /, 01 or simply by r(i) = [2 cos /, 2 sin /] (Fig. 199) 

where 0 ^ t ^ 2tt. Indeed, a 2 + y 2 = (2 cos /) 2 + (2 sin t) 2 = 4(cos 2 / + sin 2 /) = 4. For / = 0 we have 
r(0) = [2, 0], for / = ^ttwc get r(^7r) = [0, 2], and so on. The positive sense induced by this representation 
is the counterclockwise sense. 

If we replace / with /* = — we have i — —I* and get 

r 5 ^/*) = [2 cos (—/*), 2 sin (-r*)] = [2 cos t*< —2 sin /*]. 

This has reversed the orientation, and the circle is now oriented clockwise. H 

EXAMPLE 2 Ellipse 

The vector function 

(3) r(r) = [a cos t. b sin r, 0] = a cos / i + b sin / j (Fig. 200) 

represents an ellipse in the .vy-plane with center at the origin and principal axes in the direction of the a and y 
axes. In fact, since cos 2 1 + sin 2 1 = 1, we obtain from (3) 


2 2 



Fig. 199. Circle in Example 1 Fig. 200. Ellipse in Example 2 
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EXAMPLE 3 Straight Line 

A straight line L through a point A with position vector a in the direction of a constant vector b (see Fig. 201) 
can be represented parametrically in the form 

(4) r(r) = a 4- rb = 4- tb i, «2 "P tbz. a$ + tb$\. 

If b is a unit vector, its components are the direction cosines of L. In this case, |/| measures the distance of the 
points of L from A. For instance, the straight line in the Ay-plane through A: (3, 2) having slope l is (sketch it) 

r(/) = [3, 2, 0] 4-r[L I, 01 = [3+/, 2 + /, 0]. ■ 



Fig. 201. Parametric representation of a straight line 

A plane curve is a curve that lies in a plane in space. A curve that is not plane is called 
a twisted curve. A standard example of a twisted curve is the following. 

EXAMPLE 4 Circular Helix 

The twisted curve C represented by the vector function 

(5) r if) = [n cos r, a sin /, ct] = a cos r i + a sin r j + cl k (c & 0) 

is called a circular helix. It lies on the cylinder a 2 4- y 2 = a 2 . If c > 0, the helix is shaped like a right-handed 
screw (Fig. 202). If c < 0. it looks like a left-handed screw (Fig. 203). If c = 0. then (5) is a circle. M 



Fig. 202. Right-handed circular helix Fig. 203. Left-handed circular helix 

A simple curve is a curve without multiple points, that is, without points at which the 
curve intersects or touches itself. Circle and helix are simple. Figure 204 shows curves 
that are not simple. An example is [sin 2/, cos 0]. Can you sketch it? 

An arc of a curve is the portion between any two points of the curve. For simplicity, 
we say “curve” for curves as well as for arcs. 
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^)0 C 


Fig. 204. Curves with multiple points 



Tangent to a Curve 

The next idea is the approximation of a curve by straight lines, leading to tangents and 
to a definition of length. Tangents are straight lines touching a curve. The tangent to a 
simple curve C at a point P of C is the limiting position of a straight line L through P 
and a point Q of C as Q approaches P along C. See Fig. 205. 

If C is given by r (t), and P and Q correspond to t and i + At, then a vector in the 
direction of L is 

(6) ^ [r(/ + AO - r(0L 

In the limit this vector becomes the derivative 


(7) 


r'W = ljm — [r(f + At) - r(t)}, 

At— *0 l\t 


provided r(0 is differentiable, as we shall assume from now on. If r'(r) ^ 0, we call r'(0 
a tangent vector of C at P because it has the direction of the tangent. The corresponding 
unit vector is the unit tangent vector (see Fig. 205) 


Note that both r' and u point in the direction of increasing t. Hence their sense depends 
on the orientation of C. It is reversed if we reverse the orientation. 

It is now easy to see that the tangent to C at P is given by 

(9) q(vi>) = r+ vvr' (Fig. 206). 

This is the sum of the position vector r of P and a multiple of the tangent vector r' of C 
at P. Both vectors depend on P. The variable w is the parameter in (9). 



Fig. 205. Tangent to a curve Fig. 206. Formula (9) for the tangent to a curve 
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EXAMPLE 5 


Tangent to an Ellipse 

Find the tangent to the ellipse ^.v 2 + y 2 = 1 at P : (V2 r I/V2). 

Solution, Equation (3) with semi-axes a = 2 and b = 1 gives r(/) = [2 cos /, sin /]. The derivative is 
x\t) = [-2 sin r, cos t]. Now P corresponds to / = ir/4 because 

r(7r/4) = [2 cos (tt/4), sin (tt/4)] = [V2, 1/V5]. 

Hence r'(7r/4) = [-V2, 1/V2]. From (9) we thus get the answer 

q(iv) = [V2, 1/V5] + w[W 2, 1/V2] = [V2(l - w), (l/V2)(l + w)]. 

To check the result, sketch or graph the ellipse and the tangent. ■ 

Length of a Curve 

We are now ready to define the length l of a curve. I will be the limit of the lengths of 
broken lines of n chords (see Fig. 207, where n = 5) with larger and larger n. For this, 
let r(f), a ^ t ^ b, represent C. For each n = 1, 2, • • • we subdivide (“partition”) the 
interval a ^ ^ b by points 


to (= h, • • , f n _ 1? f n (= 6), where f 0 < h < • * * < t n . 

This gives a broken line of chords with endpoints r(f 0 )> • • * , r (t n ). We do this arbitrarily 
but so that the greatest |Af m | = \t m — t m ^x \ approaches 0 as n — > »>. The lengths 
l l9 / 2 , * • * of these chords can be obtained from the Pythagorean theorem. If r(r) has a 
continuous derivative r'(r), it can be shown that the sequence l l9 / 2 , * • * has a limit, which 
is independent of the particular choice of the representation of C and of the choice of 
subdivisions. This limit is given by the integral 

( 10 ) /= fVr^r'dt 

J a 



l is called the length of C, and C is called rectifiable. Formula (10) is made plausible in 
calculus for plane curves and is proved for curves in space in [GR8] listed in App. 1. The 
practical evaluation of the integral (10) will be difficult in general. Some simple cases are 
given in the problem set. 

Arc Length s of a Curve 

The length (10) of a curve C is a constant, a positive number. But if we replace the fixed 
b in (10) with a variable f, the integral becomes a function of t, denoted by $(/) and called 
the arc length function or simply the arc length of C. Thus 


(ID 





Fig. 207. Length of a curve 
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EXAMPLE 6 


Here the variable of integration is denoted by 7 because t is now used in the upper limit. 

Geometrically, s(t Q ) with some t 0 > a is the length of the arc of C between the points 
with parametric values a and r 0 - The choice of a (the point s = 0) is arbitrary; changing 
a means changing s by a constant. 


Linear Element ds. If we differentiate (11) and square, we have 

/ ds \ 2 dr dr . , / dx \ 2 ( dy \ 2 / dz \ 2 

<“> u) + (i) + U)- 

It is customary to write 


(13*) dr = [dx, dy , dz] = dxi + dy j + dz k 

and 

(13) ds 2 = dr* dr = dx 2 + dy 2 + dz 2 . 


ds is called the linear element of C. 

Arc Length as Parameter. The use of s in ( 1 ) instead of an arbitrary t simplifies various 
formulas. For the unit tangent vector (8) we simply obtain 

(14) n(s) = r'(. s). 


Indeed, (r 7 ^)) = ( ds/ds ) = 1 in (12) shows that r'(s) is a unit vector. Even greater 
simplifications due to the use of s will occur in curvature and torsion (below). 


Circular Helix. Circle. Arc Length as Parameter 

The helix r(r) = [a cos /, a sin /, ct J in (5) has the derivative r '(/) = [-a sin /, a cos t. c]. Hence r' *r' = a 2 + c 2 , 
a constant, which we denote by K 2 . Hence the integrand in (1 1) is constant, equal to K, and the integral is s = Kt. 
Thus / = s/K, so that a representation of the helix with the arc length s as parameter is 

(15) r *(s) = j = | a cos ~ , a sin ~ J , K = Vr ? 2 + c 2 . 


A circle is obtained if we set c = 0. Then K = a,t = sfa , and a representation with arc length s as parameter is 


( s \ s s 

— I = a cos — , a sin — . 

a J a a 


Curves in Mechanics. Velocity. Acceleration 

Curves play a basic role in mechanics, where they may serve as paths of moving bodies. 
Then such a curve C should be represented by a parametric representation r (t) with time 
t as parameter. The tangent vector (7) of C is then called the velocity vector v because, 
being tangent, it poin ts in t he instantaneous direction of motion and its length gives the 
speed |v| = |r'| = Vr'*r' = dsklt\ see (12). The second derivative of r(/) is called the 
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EXAMPLE 7 


acceleration vector and is denoted by a. Its length |a| is called the acceleration of the 
motion. Thus 

(16) v(r) = r '(r) f a (t) = v'(/) = r "(t). 

Tangential and Normal Acceleration. Whereas the velocity vector is always tangent 
to the path of motion, the acceleration vector will generally have another direction, so that 
it will be of the form 


(17) 


^ ^tan 


+ a 


norm» 


where the tangential acceleration vector a lan is tangent to the path (or, sometimes, 0) 
and the normal acceleration vector a norm is normal (perpendicular) to the path (or, 
sometimes, 0). 

Expressions for the vectors in (17) are obtained from (16) by the chain rule. We first 
have 


dr dr ds ds 

v(r) = — = — — = uC s) — 
dt ds dt dt 


where u(.v) is the unit tangent vector (14). Another differentiation gives 

d\ d ( ds\ du ( ds \ 2 d 2 s 

(18) a® = Tl = J t (»<*) -J ) - * ( ¥ ) + "W ■ 


Since the tangent vector u ( j ) has constant length (one), its derivative dulds is perpendicular 
to u (s) (by Example 4 in Sec. 9.4). Hence the first term on the right of (18) is the normal 
acceleration vector, and the second term on the right is the tangential acceleration vector, 
so that (18) is of the form (17). 

Now the length of a tan is the projection of a in the direction of v, given by (11) in 
Sec. 9.2 with b = v; that is, |a tan | = a«v/|v|. Hence a tan is this expression times the unit 
vector (l/|v|)v in the direction of v; that is. 


(18*) 


a«v 

**tan Also, ^norm <*tan* 


Let us consider two basic examples, involving centripetal and centrifugal accelerations 
and Coriolis acceleration , as it occurs, for instance, in space travel. 


Centripetal Acceleration. Centrifugal Force 

The vector function 

r(/) = [ R cos (ot . R sin wt] = R cos (Dt i + R sin (Dt j (Fig. 208) 

(with fixed i and j) represents a circle C of radius R with center at the origin of the .vy-plane and describes the 
motion of a small body B counterclockwise around the circle. Differentiation gives the velocity vector 

v = r ; = [ — sin Rw cos cot] = — Rw sin wt i + Ra) cos a>t j (Fig. 208). 

v is tangent to C. Its magnitude, the speed, is 

|v| = |r'| = Vr' t' = R(d. 
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EXAMPLE 8 



Fig. 208. Centripetal acceleration a 


Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals 
( o , so that it is constant, too. Differentiating the velocity vector, we obtain the acceleration vector 

(19) a = v' = [—R(x) Z cos a >/, — Rco 2 sin <ot] = —Ru> 2 cos (ot\ — Ro? sin (ot j. 

This shows that a = -o? r (Fig. 208), so that there is an acceleration toward the center, called the centripetal 
acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its 
magnitude is constant, |a| = w 2 |r| = co 2 R. Multiplying a by the mass m of B, we get the centripetal force wa. 
The opposite vector -/?ia is called die centrifugal force. At each instant these two forces are in equilibrium. 

We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential 
acceleration. M 

Superposition of Rotations. Coriolis Acceleration 

A projectile is moving with constant speed along a meridian of the rotating earth in Fig. 209. Find its acceleration. 



Fig. 209. Example 8. Superposition of two rotations 


Solution . Let .V, y, z be a fixed Cartesian coordinate system in space, with unit vectors i, j, k in the directions 
of the axes. Let the earth, together with a unit vector b, be rotating about the z-axis with angular speed <o > 0 
(see Example 7). Since b is rotaing together widi the earth, it is of the form 

b(r) = cos cat i + sin <ot j. 

Let the projectile be moving on the meridian whose plane is spanned by b and k (Fig. 209) with constant angular 
speed y > 0. Then its position vector in terms of b and k is 


r(/) = R cos yt b(/) + R sin yt k 


(R = Radius of the earth). 
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This is the model. The rest is calculation. The result will be unexpected and highly relevant for air and space 
travel. The first and second derivatives of b with respect to t are 

b / (r) = —co sin cot i + co cos cot j 

(20) 

b ;, (/) = -co 2 cos cot i — co 2 sin cot j = -o^bO). 

The first and second derivatives of r(r) with respect to / are 

v = r^/) = R cos yt b' — yR sin yt b + yR cos yt k 

(21) a = \ = R cos yt b” — 2 yR sin yt b f — y 2 R cos yt b — y 2 /? sin yt k 

= R cos yt b" - 2 yR sin yt b' - y 2 r. 

By analogy with Example 7 and because of b" = -a> 2 b in (20) we conclude that the first term in a (involving 
co in b w !) is the centripetal acceleration due to the rotation of the earth. Similarly, the third term in the last line 
(involving y!) is the centripetal acceleration due to the motion of the projectile on the meridian M of the rotating 
earth. 

The second, unexpected term -2y R sin yt b' in a is called the Coriolis acceleration 3 (Fig. 209) and is due 
to the interaction of the two rotations. On the Northern Hemisphere, sin yt > 0 (for t > 0; also y > 0 by 
assumption), so that a cor has the direction of — b # . dial is, opposite to the rotation of the earth. |a cor | is maximum 
at the North Pole and zero at the equator. The projectile B of mass m 0 experiences a force -Wo a cor opposite 
to woa cor , which tends to let B deviate from M to the right (and in the Southern Hemisphere, where sin yt < 
0. to the left). This deviation has been observed for missiles, rockets, shells, and atmospheric air flow. ■ 

Curvature and Torsion. Optional 

This optional portion of the section completes our discussion of curves from the viewpoint 
of vector calculus. 

The curvature k(s) of a curve C: r(s) (s the arc length) at a point P of C measures the 
rate of change |u'(s)| of the unit tangent vector u(s) at P. Hence k(s) measures the deviation 
of C at P from a straight line (its tangent at P). Since u(s) = r'(s), the definition is 

(22) k(s) = |u'(.s)| = |r"(j)| (' = dlds). 

The torsion t(s) of C at P measures the rate of change of the osculating plane O (the 
plane spanned by u and u\ see Fig. 210) of C at P . Hence r(^) measures the deviation 


re 

E 

o 


£ 



Fig. 210. Trihedron. Unit vectors u, p, b and planes 


3 GUSTAVE GASPARD CORIOLIS (1792-1843), French engineer who did research in mechanics. 



398 


CHAP. 9 Vector Differential Calculus. Grad, Div, Curl 


of C at P from a plane (from O at P). Now the rate of change is also measured by the 
derivative b' of a normal vector b at O. By the definition of vector product, a unit normal 
vector of O is b = u x (1/k)u' = uxp, where p = (1/k)u ; is called the unit principal 
normal vector and b is called the unit binormal vector of C at P; see Fig. 210. Here we 
must assume that k =£ 0; hence k > 0. The absolute value of the torsion is now defined by 

(23*) Ml = |b'(s)|. 

Whereas k(s ) is nonnegative, it is practical to give the torsion a sign, motivated by 
“right-handed” and “left-handed” (see Figs. 202, 203). This needs a little further 
calculation. Since b is a unit vector, it has constant length. Hence b' is perpendicular to 
b (see Example 4 in Sec. 9.4). Now b' is also perpendicular to u because by the definition 
of vector product we have b*u = 0, b*u'=0. This implies 

(b«u)' = 0; that is, b'*u + b*u' = b' # u + 0 = 0. 

Hence if b' =£0 at P, it must have the direction of p or -p, so that it must be of the form 
b' = — rp. Taking the dot product of this by p and using p # p = 1 gives 

(23) t(s) = -pCs)*b'(.y). 

The minus sign is chosen to make the torsion of a right-handed helix positive and that of 
a left-handed helix negative (Figs. 202, 203). The orthonormal vector triple u, p, b is 
called the trihedron of C. Figure 210 also shows the names of the three straight lines in 
the directions of u, p, b, which are the intersections of the osculating plane, the normal 
plane, and the rectifying plane. 




1-10 


PARAMETRIC REPRESENTATIONS 


Find a parametric representation of the following curves. 

1. Circle of radius 3, center (4, 6) 

2. Straight line through (5, 1 , 2) and ( 1 1 , 3, 0) 

3. Straight line through (2, 0, 4) and (—3, 0, 9) 

4. Straight line y = 2.v + 3, z = 7 a* 

5. Circle .y 2 + 4y + z 2 = 5, x = 3 

6. Ellipse .v 2 + y 2 = 1 , z — v 

7. Straight line through ( a , /?, c) and (a + 3, b — 2, c + 5) 

8. Intersection of a* + y — z = 2, 2a* — 5y + z = 3 


9. Circle 5A* 2 + y 2 = 1, z = y 
10. Helix a * 2 + y 2 = 9. z = 4 arctan (v/x) 


11-18 What curves are represented as follows? 

11. [2 + r cos 4/, 6 4- r sin At, 2t] 

12. [4 - 2/, 8/, -3 + 5/] 

13. [2 + cos 3/, — 2 + sin 3/, 5] 

14. [/, / 2 , / 3 ] 


15. [Vcos /, Vsin /, O) (“. Latm * curve”) 

16. [cosh /, sinh /, 0] 

17. [/, 1 //, 0] 

18. [1,5 + /, -5 + I//1 

19. Show that setting t = — /*• reverses the orientation of 
[ 1 a cos /, a sin /, 0]. 

20. If we set / = e l in Prob. 12, do we get the entire line? 
Explain. 

21. CAS PROJECT. Curves. Graph the following more 
complicated curves. 

(a) r(/) = [2 cos t + cos 2/, 2 sin / — sin 2/J 
( 1 Steiner r s hypocycloid) 

(b) r(/) = [cos t + k cos 2/, sin t — k sin 2t] with 

k = 10, 2, 1,|, 0, -§, -1 

(c) r(/) = [cos /, sin 5/] (a Lissajous cuive) 

(d) r(/) = [cos /, sin kt]. For what k* s will it be 
closed? 

(e) r(t) = [/? sin m + coRt , R cos cot + /?J (cycloid). 
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22-25 


TANGENT 


Given a curve C: r(f), find a tangent vector r'(/), a unit 
tangent vector u'(/), and the tangent of C at P. Sketch the 
curve and the tangent. 

22. r(/) = [u / 2 , 0], P: (2, 4, 0) 

23. r(/) = [5 cos /, 5 sin r, 0], P: (4, 3, 0) 

24. r(/) = [3 cos r, 3 sin /, 4/], P : (3, 0, 87t) 

25. r(f) = [cosh f, sinh tl P: (§, §) 


1 26-28] LENGTH 

Find the length and sketch the curve. 

26. Circular helix r(/) = [2 cos r, 2 sin t , 6/] from 

(2, 0, 0) to (2, 0, 24^r) 

27. Catenary r(7) = [t, cosh t] from t = 0 to t = 1 

28. Hypocycloid r(/) = [a cos 3 1 , a sin 3 r], total length 


29. Show that (10) implies t = f Vl + y' 2 dx for the 

J a 

length of a plane curve C: y = /(a*), z = 0, a ^ x ^ b. 


30. Polar coordinates p = Va 2 -I- y 2 , 6 = arctan (y/x) 
give€ = J Vp 2 +■ p 2 d0, where p = dpfdO. Derive 


this. Use it to find the total length of the cardioid 
p = fl(l - cos 0). Sketch this curve. Hint. Use (10) 
in App. 3.1. 


31. CAS PROJECT. Polar Representations. Use your 
CAS to graph the following famous curves 4 and 
investigate their form depending on parameters a and b. 


p = a 6 Spiral of Archimedes 
p = ae bu Logarithmic spiral 
2a sin 2 0 

p= — Cissoid of Diodes 

H cos 0 J 


p = + b Conchoid of Nicomedes 

cos 6 

p = afO Hyperbolic spiral 
3 a sin 26 

P = — Folium of Descartes 
H cos 3 e + sin 3 0 

sin 30 

p — 2a . — — Mac la u r in *s trisectrix 

H sin 26 

p — 2a cos 6 + b Pascal's snail 


32-34 


CURVES IN MECHANICS 


Velocity and Acceleration. Forces on moving objects 
(cars, airplanes, etc.) require that the engineer knows 
corresponding tangential and normal accelerations. Find 
diem, along with the velocity and speed, for the following 
motions. Sketch the padi. 


32. r(0 = [4/, -3/, 0] 

33. r(f) = [f, t 2 , 0] 


34. r(/) = [cos /, 2 sin /, 0] 


35. (Cycloid) Given 

r(/) = ( R sin cot + cufl/)i + ( R cos cot + R) j. 

This cycloid is the path of a point on the rim of a wheel 
of radius R that rolls without slipping along the jc-axis. 
Find v and a at the maximum y-values of the curve. 

36. CAS PROJECT. Paths of Motions. Gear 
transmissions and other engineering constructions 
often involve complicated paths whose study is greatly 
facilitated by the use of a CAS. To grasp the idea, graph 
the following paths and find die velocity, the speed, 
and the tangential and normal accelerations. 

(a) r(r) = [2 cos t + cos 2/, 2 sin f — sin 2/] 
(i Steiner's hypocycloid) 

(b) r(f) = [cos t + cos 2 1, sin / - sin 2 1] 

(c) r (0 = [cos t , sin 2/, cos 2t] 

(d) r(r) = [ct cos /, ct sin t , ct] (c ^ 0) 

37. (Sun and earth) Find the acceleration of the earth 
toward the sun from (19) and the fact that the earth 
revolves about the sun in a nearly circular orbit with 
an almost constant speed of 30 km/sec. 

38. (Earth and moon) Find the centripetal acceleration of 
the moon toward the earth, assuming that the orbit 
of the moon is a circle of radius 239,000 miles 
= 3.85 • 10 8 m, and the time for one complete 
revolution is 27.3 days = 2.36* 10 6 sec. 

39. (Satellite) Find the speed of an artificial earth satellite 
traveling at an altitude of 80 miles above the earth’s 
surface, where g = 3 1 ft/sec 2 . (The radius of the earth 
is 3960 miles.) 

40. (Satellite) A satellite moves in a circular orbit 
450 miles above the earth’s surface and completes 
I revolution in 100 min. Find the acceleration of 
gravity at the orbit from these data and from the radius 
of the earth (3960 miles). 


4 Named after ARCHIMEDES (c. 287-212 B.C.), DESCARTES (Sec. 9.1), DIOCLES (200 B.C.), 

MACLAURIN (Sec. 15.4), NICOMEDES (250? b.c.) &TIENNE PASCAL (1588-1651), father of BLAISE 

PASCAL (1623-1662). 
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41-50 


CURVATURE AND TORSION 


41. Show that a circle of radius a has curvature 1 fa. 

42. Using (22), show that if C is represented by r (t) with 
arbitrary t, then 


(22*) *(r) = 


V(r'*r / )(r"»r") - (r'-r ") 2 
(r'-r ') 3 ' 2 


43. Using (22*), show that for a curve y = /(a*) in the 
xy-plane. 


< 22 **> «<*>=(, (/=f.eic.). 

44. Using b = uxp and (23), show that 

(23**) t(s) = (u p p') = (r' r" r "Vk 2 


(k > 0 ). 


45. Show that the torsion of a plane curve (with k > 0) is 
identically zero. 

46. Show that if C is represented by r(/) with arbitrary 
parameter /, then, assuming k > 0 as before, 


47. 


48. 


49. 


50. 


(23***) r(/) = 


(r' 


r'") 


, ( / W // My , / //v 2 

(r *r )(r *r ) - (r *r ) 

Find the torsion of C: r (f) = [/, / 2 , r 3 ] (which looks 
similar to the curve in Fig. 210). 

(Helix) Show that the helix [a cos /, a sin r, ct ] can 
be represented by [a cos (s/K). a sin (s/K), csIK ], 


where K = Vr/ 2 + c 2 and 5 is the arc length. Show 
that it has constant curvature k = a IK 2 and torsion 
r = c/K 2 . 

Obtain k and r in Prob. 48 from (22*) and (23***) and 
the original representation in Prob. 48 with parameter /. 
(Frenet 5 formulas) Show that 
u ; = Kp, p' = — ku + rb, b ; = — rp. 


9.6 Calculus Review: 

Functions of Several Variables. 1 ptional 

Curves required vector functions of a single variable x or s , and we now proceed to 
vector functions of several variables, beginning with a review from calculus. Go on to 
the next section, consulting this material only when needed . (We include this short 
section to keep the book reasonably self-contained. For partial derivatives see 
App. A3.2.) 

Chain Rules 

Figure 211 shows the notations in the following basic theorem. 



5 JEAN-FR£d£RIC FRENET (18 16-1900), French mathematician. 
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THEOREM 1 


Chain Rule 

Let w = f(x, v, z) be continuous and have continuous first partial derivatives in 
a domain D in xyz-space. Let x = x(u, v), y = y(u , u), z = z(u, v) be functions 
that are continuous and have first partial derivatives in a domain B in the 
uv-plane, where B is such that for eveiy point (it, v) in B, the corresponding point 
[x(m, v) 9 y(u, v ), z(u, i>)] lies in D. See Fig. 211. Then the function 

w = f(x(u 9 v ), y(u, v ), z(u 9 v)) 

is defined in B, has first partial derivatives with respect to u and v in B , and 

dw dw dx dw dy dw dz 

du dx du dy du dz du 

( 1 ) 

dw _ dw dx dw dy dw dz 

dl) dx dv dy du dz dv 


In this theorem, a domain D is an open connected point set in jtyz-space, where “connected” 
means that any two points of D can be joined by a broken line of finitely many linear 
segments all of whose points belong to D. “Open” means that every point P of D has a 
neighborhood (a little ball with center P ) all of whose points belong to D. For example, 
the interior of a cube or of an ellipsoid (the solid without the boundary surface) is a domain. 

In calculus, x, y, z are often called the intermediate variables, in contrast with the 
independent variables w, v and the dependent variable w . 

Special Cases of Practical Interest 

If w = f(x , y) and x = x(u, v ), y = y(u 9 v) as before, then (1) becomes 

dw dw dx dw dy 

du dx du dy du 

( 2 ) 

dw _ dw dx dw dy 

dv dx dv dy dv ’ 


If vv = /( x, y, z) and x = x(t), y = y(r), z = z(t\ then (1) gives 


(3) 


dw ^ dw dx dw dy aw dz 

dt dx dt dy dt dz dt ’ 


If w = f(x , y) and x = x(t), y = y(t), then (3) reduces to 
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EXAMPLE 1 


THEOREM 2 


Finally, the simplest case w = /(*), x — x{t) gives 

dw _ dw dx 

(5) ~dt “ ~dx~dt' 

Chain Rule 

If w - x 2 - y 2 and we define polar coordinates r, 0 by x = r cos 0, y = r sin 0, then (2) gives 

— = 2* cos 0 - 2y sin 0 = 2rcos 0 - 2rsin 0 = 2rco$20 
dr 

= 2*(-r sin 0) — 2>>(r cos 0) = —2 r 2 cos 0 sin 0 - 2r 2 sin 0cos 0 = -2r 2 sin 20. ■ 


Mean Value Theorems 


Mean Value Theorem 

Let f(x, y, z) be continuous and have continuous first partial derivatives in a 
domain D in xyz-space . Let P 0 : (* 0 , y 0 , Zq) and P : (x 0 + h, y 0 + k, z 0 4* /) be 
points in D such that the straight line segment PqP joining these points lies entirely 
in D. Then 


(6) f(x 0 + K J>o + z 0 + /) - f(x 0i y 0i z 0 ) = h— + k — + / — , 

ox dy dz 

the partial derivatives being evaluated at a suitable point of that segment . 



Fig. 212. Mean value theorem for a function of two variables [Formula (7)] 


Special Cases 

For a function f(x , y ) of two variables (satisfying assumptions as in the theorem), formula 
(6) reduces to (Fig. 212) 


(7) 


f(x 0 + h,y 0 + k) - /(. Xq, y 0 ) = 

dx dy 
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and for a function f(x) of a single variable, (6) becomes 
(8) f(x 0 + h)-f(x 0 ) = h^, 

where in (8), the domain D is a segment of the A*-axis and the derivative is taken at a 
suitable point between x 0 and x 0 + h. 


P RO BLE M -fr E.T_3L - 6 _ _ 


1-5 


DERIVATIVE 


Find dw/dt by (3) or (4). Check the result by substitution 
and differentiation. (Show the details.) 

1. w = Va 2 + y 2 , a = e 2t \ y — e~ 2t 

2. w = y/x, x = g(/), y = h(t) 

3. w = x y , x = cosh f, y = sinh / 

4. w = xy + yz + zx, x = 2 cos /, y = 2 sin /, z = 5/ 

5. w = (a 2 + y 2 + z 2 ) 3 , a = / 2 , y = f 4 , z = / 2 


6-9 


PARTIAL DERIVATIVES 


Find dw/du and dw/dv by (1) and (2). Check the result by 
substitution and differentiation. (Show the details.) 


6. w — 4a 2 — 4y 2 , a = u + 2v, y = 2 u — u 

7. w = x 2 y 2 , x — e u cos u, y = e u sin v 

8. \v = a 4 - 4A 2 y 2 + y 4 , a = uv, y = u/v 


9. = 1/(a 2 + y 2 + z 2 ), a = m 2 + u 2 , y = u 2 - v 2 y 

z = 2uu 

10. (Partial derivatives on a surface) Let w = /(a, y, z), 
and let z = g( a, y) represent a surface S in space. Then 
on S, the function becomes 

w(a, y) = /[a, y, g( a, y)]. 

Show that its partial derivatives are obtained from 

c )w _ ()f + (if fig dw df ^ df fig 

fix dx fiz fix ’ dy dy fiz dy 

Iz = g( a, y)]. 

Apply this to / = a 3 -r y 3 + z 2 . g = a 2 + y 2 and 
check by substitution and direct differentiation. (The 
general formula will be needed in Sec. 10.9.) 


9.7 Gradient of a Scalar Field. 

Directional Derivative 

We shall see that some of the vector fields in applications — not all of them! — can 
obtained from scalar fields. This is a considerable advantage because scalar fields can 
handled more easily. The relation between these two kinds of fields is obtained by the 
“gradient,” which is thus of great practical importance. 


DEFINITION 1 


Gradient 

The gradient of a given scalar function f(x, y, z) is denoted by grad / or Vf (read 
nabla /) and is the vector function defined by 


( 1 ) 


grad / = V/ = 



Here a, y, z are Cartesian coordinates in a domain in 3-space in which / is defined 
and differentiable. (For curvilinear coordinates see App. 3.4.) 


S’ S’ 
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For instance, if /( x, y, z) = 2 y 3 -l* 4xz + 3x, then grad / = [4z + 3, 6y 2 , 4x]. 

The notation V/ is suggested by the differential operator V (read nabla ) defined by 


(1*) 


V = 


d d 

— i + — j 
dx dy J 



Gradients are useful in several ways, notably in giving the rate of change of /( x, y, z) 
in any direction in space, in obtaining surface normal vectors, and in deriving vector fields 
from scalar fields, as we are going to show in this section. 

Directional Derivative 

From calculus we know that the partial derivatives in (1) give the rates of change of 
f(x, y, z ) in the directions of the three coordinate axes. It seems natural to extend this and 
ask for the rate of change of / in an arbitrary direction in space. This leads to the following 
concept. 


DEFINITION 2 


Directional Derivative 

The directional derivative D h f or df/ds of a function /(x, y, z) at a point P in the 
direction of a vector b is defined by (see Fig. 213) 


( 2 ) 


D b f = -y- = lim 
ds s— >o 


m - m 

s 


Here Q is a variable point on the straight line L in the direction of b, and |^| is the 
distance between P and Q. Also, s > 0 if Q lies in the direction of b (as in 
Fig. 213), s < 0 if Q lies in the direction of —b, and s = 0 if Q = P. 



The next idea is to use Cartesian ^-coordinates and for b a unit vector. Then the line L 
is given by 

(3) r {s) = x(s)i 4- y(s ) j + z(s ) k = p 0 + *b (|b| = 1) 

where p 0 the position vector of P. Equation (2) now shows that D b f = df/ds is the 
derivative of the function f(x(s), y(s ), z(s)) with respect to the arc length s of L. Hence, 
assuming that / has continuous partial derivatives and applying the chain rule [formula 
(3) in the previous section], we obtain 


n . _ df W / 
D b f = ~r = —x 
ds dx 


,9f , df 

H v H 

dy ' dz 


/ 

z 


(4) 
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where primes denote derivatives with respect to s (which are taken at s = 0). But here t 
differentiating (3) gives r' = x ; i + y'j + z k = b. Hence (4) is simply the inner product 
of grad / and b [see (2), Sec. 9.2]; that is, 

(5) D h f = = b»grad/ (|b| = 1). 

as 

ATTENTION! If the direction is given by a vector a of any length (=£ 0), then 

df 1 

(5*) A»/ = ~r = TT a# grad /. 

ds |a| 


EXAMPLE 1 Gradient. Directional Derivative 

Find the directional derivative of /(.v, y, z ) = 2x 2 + 3 y 2 + z 2 at P : (2, 1. 3) in the direction of a = [1,0. —2]. 

Solution . grad f = [4.v, 6y. 2zJ gives at P the vector grad f(P) = [8, 6, 61. From this and (5*) we obtain, 
since |a| = V5. 

DJ(P) = 11.0, —21* [8, 6. 61 = -^= (8 + 0 - 12) = - = -1.789. 

The minus sign indicates that at P the function / is decreasing in the direction of a. ■ 

Gradient Is a Vector. Maximum Increase 

grad / in (1) looks like a vector — after all, it has three components! But to prove that it 
actually is a vector, since it is defined in terms of components depending on the Cartesian 
coordinates, we must show that grad f has a length and direction independent of the choice 
of those coordinates. In contrast, [ df/dx , 2 df/dy\ df/dzl also looks like a vector but 
does not have a length and direction independent of the choice of Cartesian coordinates. 

Incidentally, the direction makes the gradient eminently useful: grad / points in the 
direction of maximum increase of /. 


THEOREM 


Vector Character of Gradient. Maximum Increase 

Lei f(P) = fix , y, z) be a scalar Junction having continuous first partial derivatives 
in some domain B in space . Then grad / exists in B and is a vector ; that is } its length 
and direction are independent of the particular choice of Cartesian coordinates. If 
grad f(P) ^ 0 at some point P. it has the direction of maximum increase of f at P. 


PROOF From (5) and the definition of inner product [(1) in Sec. 9.2] we have 

(6) D h f = |b| |grad /| cos y = |grad f\ cos y 

where y is the angle between b and grad /. Now / is a scalar function. Hence its value 
at a point P depends on P but not on the particular choice of coordinates. The same holds 
for the arc length s of the line L in Fig. 213, hence also for D h f . Now (6) shows that D h f 
is maximum when cos y — 1, y — 0, and then D b f = |grad /|. It follows that the length 
and direction of grad f are independent of the choice of coordinates. Since y = 0 if and 
only if b has the direction of grad /, the latter is the direction of maximum increase of 
/ at P , provided grad / 0 at P. ■ 




406 


CHAP. 9 Vector Differential Calculus. Grad, Div, Curl 


THEOREM 


EXAMPLE 2 


Gradient as Surface Normal Vector 

Gradients have an important application in connection with surfaces, namely, as surface 
normal vectors, as follows. Let 5 be a surface represented by f(x, y 7 z) = c = const , where 
/ is differentiable. Such a surface is called a level surface of /, and for different c we get 
different level surfaces. Now let C be a curve on S through a point P of S. As a curve in 
space, C has a representation r (t) = [a*(0, y(t) 7 z(t)]. For C to lie on the surface S, the 
components of r(/) must satisfy /(*, y, z) = c, that is, 

(7) f(x(t) 7 y(i), z(t)) = c. 

Now a tangent vector of C is r f (t) = [jc'(r), y'(0, z(t)\ . And the tangent vectors of all 
curves on S passing through P will generally form a plane, called the tangent plane of S 
at P. (Exceptions occur at edges or cusps of £, for instance, for the cone in Fig. 215 at 
the apex.) The normal of this plane (the straight line through P perpendicular to the tangent 
plane) is called the surface normal to S at P. A vector in the direction of the surface 
normal is called a surface normal vector of S at P. We can obtain such a vector quite 
simply by differentiating (7) with respect to t. By the chain rule, 

+ Y~y' + Y~z' = (grad/)*r' = 0. 
dx dy dz 

Hence grad / is orthogonal to all the vectors r' in the tangent plane, so that it is a normal 
vector of S at P. Our result is as follows (see Fig. 214). 

Tangent plane /*= const v 



Fig. 214. Gradient as surface normal vector 


Gradient as Surface Normal Vector 

Let f be a differentiable scalar function in space . Let f(x, y,z) — c — const represent 
a siuface S . Then if the gradient of f at a point P of S is not the zero vector ; it is 
a normal vector of S at P. 


Gradient as Surface Normal Vector. Cone 

Find a unit normal vector n of the cone of revolution z 2 = 4(.r 2 + y 2 ) at the point P; (1, 0, 2). 

Solution. The cone is the level surface / = 0 of /(; r, y\ z) = 4(* 2 + y 2 ) - z 2 . Thus (Fig. 215), 

grad / = [8Lv, 8 y t -2 zl grad /(F) = [8, 0, -4] 

-^s]' 

n points downward since it has a negative z-component. The other unit normal vector of the cone at P is -n. ■ 




SEC 9.7 Gradient of a Scalar Field Directional Derivative 


407 



Fig. 215. Cone and unit normal vector n 


Vector Fields That Are Gradients of Scalar Fields 
(“Potentials”) 

At the beginning of this section we mentioned that some vector fields have the advantage 
that they can be obtained from scalar fields, which can be handled more easily. Such a 
vector field is given by a vector function v(P), which is obtained as the gradient of a scalar 
function, say, v(P) = grad /(P). The function f(P) is called a potential function or a 
potential of v(P). Such a v(P) and the corresponding vector field are called conservative 
because in such a vector field, energy is conserved; that is, no energy is lost (or gained) 
in displacing a body (or a charge in the case of an electrical field) from a point P to another 
point in the field and back to P. We show this in Sec. 10.2. 

Conservative fields play a central role in physics and engineering. A basic application 
concerns the gravitational force (see Example 3 in Sec. 9.4) and we show that it has a 
potential which satisfies Laplace’s equation, the most important partial differential 
equation in physics and its applications. 


THEOREM 3 


Gravitational Field. Laplace’s Equation 

The force of attraction 


( 8 ) 



x - x 0 y “ yo z- Zo 
r 3 ’ r 3 ’ r 8 


between two particles at points P 0 : (a' 0 , y 0 , z 0 ) and P: (jc, y, z) (as given by Newton’s 
law of gravitation) has the potential /(jc, y, z) = c/r ; where r (> 0) is the distance 
between P 0 and P. 

Thus p = grad f = grad (c/r). This potential f is a solution o/Laplace’s equation 


(9) 



d 2 f d 2 f 
dy 2 dz 2 


= 0 . 


[V 2 / (read nabla squared f) is called the Laplacian of /.] 
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PROOF 


That distance is r = ((x - x 0 ) 2 + (y - y 0 ) 2 + (z~ z 2 f) m . The key observation now is 
that for the components of p = [p 1( p 2 , P3] we obtain by partial differentiation 


(10a) 
and similarly 

(10b) 


±(i)_ 

dx \r) 


—2(x - x 0 ) 


x — x 0 


2[(x - x 0 f + (y - y 0 ) 2 + (z - Zo) 2 ] 

_d_ /j_\ y ~ y 0 
dy \r) r 3 ' 

J_ ( l\ _ Z-Zo 

dz W " r 3 ' 


From this we see that, indeed, p is the gradient of the scalar function f = c/r. The second 
statement of the theorem follows by partially differentiating (10), that is. 


— W — 

dx 2 \>7 r 3 

m _j_ 

ay 2 \r) r 3 

JL (L\ - _J_ 

dz 2 \rj r 3 


3(x - x 0 ) 2 


3 (y - 3’o) Z 


1 3(z - zo) 2 

T K 


and then adding these three expressions. Their common denominator is r 5 . Hence the three 
terms — 1/r 3 contribute — 3/* 2 to the numerator, and the three other terms give the sum 

3(* - * 0 ) 2 + 3 (y - y 0 ) 2 + 3 {z - z 0 ) 2 = 3r 2 , 
so that the numerator is 0, and we obtain (9). ■ 

V 2 / is also denoted by A/. The differential operator 


(ID 


A a 2 d 2 d 2 

V 2 = A = —2 + TT + TT 
dx 2 dy 2 dz 2 


(read “nabla squared” or “delta”) is called the Laplace operator. It can be shown that 
the field of force produced by any distribution of masses is given by a vector function 
that is the gradient of a scalar function /, and / satisfies (9) in any region that is free of 
matter. 

The great importance of the Laplace equation also results from the fact that there are 
other laws in physics that are of the same form as Newton’s law of gravitation. For instance, 
in electrostatics the force of attraction (or repulsion) between two particles of opposite (or 
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like) charge Q x and Q 2 is 

k 

(12) p = — r (Coulomb’s law 6 ). 


Laplace’s equation will be discussed in detail in Chaps. 12 and 18. 

A method for finding out whether a given vector field has a potential will be explained 


in Sec. 9.9. 


T^I CALCULATION OF GRADIENTS 

Find V/. Graph some level curves / = const . Indicate V/ 
by arrows at some points of these curves. 

1. / = X 2 + .V 2 2. / = x 2 + y 2 

3 • f = ~ 4. / = x 4 + y 4 

5. / = (x - 2)(y + 2) 

6 . / = (x - 3 ) 2 + 0’ - l ) 2 

|7-I2| USE OF GRADIENTS. VELOCITY FIELDS 

Given the velocity potential / of a flow, find the velocity 
v = V/ of the flow and its value at P. Make a sketch of 
y(P). ' 

7. / = .v 2 + y 2 + z 2 , P: (3, 2, 2) 

8. / = In (a 2 4- v 2 ), P: (4, 3) 

9. / = cos x cosh y, P: ( 57 r. In 2) 

10 . / = a 2 4- 4y 2 4- 9 z 2 , P: (3, 2, 1) 

11. f = e x siny, P: (l t 77 ) 

12. / = ( a 2 4- y 2 4- z 2 r vz , P: (2, 1, 2) 

13—18 1 HEAT FLOW 

Experiments show that in a temperature field, heat flows in 
the direction of maximum decrease of temperature T. Find 
this direction in general and at a given point P. Sketch that 
direction at P as an arrow. 

13. T = x 2 - y 2 , P: (2, I) 

14. T = arctan — t P; (2 t 2) 

A* 

15. T = x 3 - 3xy 2 , P: (VH, V2) 

16. T = x/(x 2 + y 2 ), P: (4, 0) 

17. T = 3x 2 y - .v 3 , P: (4, -2) 

18. T = sin x cosh y, P: (|tt, In 5) 


1 19-24 1 ELECTRIC FORCE 

The force in an electrostatic field f(x 9 y, z) has the direction 
of the gradient of /. Find Vf and its value at P. 

19. / = (x - l) 2 - (y + l) 2 , P: (4, -3) 

20. f = y/(x 2 + y 2 ), P: (5. 3) 

21. / = x 2 - 2x - y 2 , P: (-2, 6) 

22. / = In (x 2 + y 2 ), P: (3, 3) 

23. f = (x 2 + y 2 + z 2 )“ 1/2 , />: (12, 0, 16) 

24. / = x 2 y - Jy». P: (2, 3) 

25. (Gradient) What does it mean if |grad f(P)\< |grad f(Q)\ 
at two points P and Q in a scalar field? 

26. (Landscape) If z(x, y) = 2000 - 4x 2 — y 2 [meters] 
gives the elevation of a mountain above sea level, what 
is the direction of steepest ascent at P: (3, —6)? What 
does the mountain look like? 

27-32 1 SURFACE NORMAL 

Find a normal vector of the surface at the given point P. 

27. ax + by 4- cz = d, any P 

28. x 2 + 3y 2 + z 2 = 28, P: (4, 1. 3) 

29. x 2 + y 2 = 25, P: (4, 3, 8) 

30. x 2 - y 2 + 4z 2 = 67, P: (-2, 1, 4) 

31. x 4 + y 4 + z 4 = 243. P: (3, 3, 3) 

32. z = x 2 + y 2 , P: (3, 4, 25) 

33-38 1 DIRECTIONAL DERIVATIVE 

Find the directional derivative of / at P in the direction 
of a. 

33. / = x 2 + y 2 - z. P: (1. 1. -2), a = [1. 1, 2] 

34. / = x 2 + y 2 + z 2 , P: (2, -2, 1), a = [- 1, -1, 0] 

35. / = xyz, P: (- 1, 1, 3), a = [1, -2, 2] 


e CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer. Coulomb’s law was 
derived by him from his own very precise measurements. 
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36. / = (x 2 + / + z 2 )- 1 ' 2 , P: (4, 2, -4), a = [1, 2, -2] 

37. / = <? x sin >>, />; (2, 0), a = [2, 3, 0] 

38. / = 4x 2 + y 2 + 9z 2 . P: (2, 4, 0), a = [-2. -4, 3] 

1 39-1 1 1 POTENTIALS 

for a given vector field — if they exist! — can be obtained by 
a method to be discussed in Sec. 9.9. In simpler cases, use 
inspection. Find a potential / = grad v for given v(a*, y, z). 

39. v = [3a, 5 y, —4 z] 

40. v = [ye*, e* r , 2 z] 

41. v = [4 a* 3 , 3y 2 , -6c] 

42. Project. Useful Formulas for Gradients and 
Laplacians. Prove the following formulas and give for 


each of them two examples showing when they are 
advantageous. 

V(fg) = / Vg + gVf 

V(/ B ) = nf'-'Vf 

V(//g) = (l/g 2 )(gV/ - /Vg) 

V 2 (/g) = gV 2 / + 2 V/*Vg + /V 2 g 

43. CAS PROJECT. Equipotential Curves. Graph some 
isotherms (curves of constant temperature) and 
indicate directions of heat flow by arrows when the 
temperature T(x, y) equals: 

(a) a * 3 — 2>xy 2 (b) sin a* sinhy (c) e x siny. 


9.8 Divergence of a Vector Field 

Vector calculus owes much of its importance in engineering and physics to the gradient, 
divergence, and curl. From a scalar field we can obtain a vector field by the gradient 
(Sec. 9.7). Conversely, from a vector field we can obtain a scalar field by the divergence, 
or another vector field by the curl (to be discussed in Sec. 9.9). These concepts were 
suggested by basic physical applications, as we shall see. 

To begin, let v(jc, y, z) be a differentiable vector function, where x, y, z are Cartesian 
coordinates, and let v l9 v 2 , v 3 be the components of v. Then the function 


( 1 ) 


dv ± dv 2 

div v = 1 

dx dy 


dv 3 

dz 


is called the divergence of v or the divergence of the vector field defined by v. For 
example, if 


v = [3 xz, 2xy\ ~yz 2 ] = 3*zi + 2Ayj — yz 2 k, then div v = 3z + 2a* — 2yz. 
Another common notation for the divergence is 

d i vv = 7. v =[A,±,±J. [ „ l , V2i „ 5 j 


dV 1 

dx 


dv 2 dv 3 

dy dz 


with the understanding that the “product” (dldx)Vi in the dot product means the partial 
derivative dvjdx, etc. This is a convenient notation, but nothing more. Note that V»v 
means the scalar div v, whereas V/ means the vector grad / defined in Sec. 9.7. 
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THEOREM 1 


EXAMPLE 1 


In Example 2 we shall see that the divergence has an important physical meaning. 
Clearly, the values of a function that characterizes a physical or geometric property must 
be independent of the particular choice of coordinates; that is, those values must be 
invariant with respect to coordinate transformations. Accordingly, the following theorem 
should hold. 


Invariance of the Divergence 

The divergence div v is a scalar function , that is , its values depend only on the 
points in space (and, of course, on v) but not on the choice of the coordinates in 
( 1 ), so that with respect to other Cartesian coordinates a**, y *, z* and corresponding 
components v x *, u 2 *> v 3* of\ 9 


( 2 ) 


div v = 


dv x * dv 2 * tag* 

dx* dy* dz* 


We shall prove this theorem in Sec. 10.7, using integrals. 

Presently, let us turn to the more immediate practical task of gaining a feel for the 
significance of the divergence as follows. Let /(a, y, z) be a twice differentiable scalar 
function. Then its gradient exists, 


v = gtad f = 


a/ df\ = a/. a/ . a/ k 

dx dy ’ dz _ dx dy ^ dz 


and we can differentiate once more, the first component with respect to a, the second with 
respect to y, the third with respect to z, and then form the divergence. 


d 2 f d 2 f 

div v = div (grad/) = 72+71 + 
ox dy 


a 2 / 

a* 2 ’ 


Hence we have the basic result that the divergence of the gradient is the Laplacian 
(Sec. 9.7), 

(3) div (grad /) = V 2 /. 


Gravitational Force. Laplace’s Equation 

The gravitational force p in Theorem 3 of the last section is the gradient of the scalar function fix, y, z) = c/r, 
which satisfies Laplaces equation V 2 / = 0. According to (3) this implies that div p = 0 (r > 0). M 

The following example from hydrodynamics shows the physical significance of the 
divergence of a vector field. (More physical details on this significance will be added in 
See. 10.8.) 
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EXAMPLE 2 


Flow of a Compressible Fluid. Physical Meaning of the Divergence 

We consider the motion of a fluid in a region R having no sources or sinks in R , that is, no points at which 
fluid is produced or disappears. The concept of fluid state is meant to cover also gases and vapors. Fluids in 
the restricted sense, or liquids (water or oil, for instance), have very small compressibility, which can be neglected 
in many problems. Gases and vapors have large compressibility; that is, their density p (= mass per unit volume) 
depends on the coordinates a\ y. z in space (and may depend on time t). We assume that our fluid is compressible. 

We consider the flow through a rectangular box B of small edges Ax, Ay, A z parallel to the coordinate axes 
(Fig. 216), (A is a standard notation for small quantities; of course, it has nothing to do with the notation for the 
Laplacian in (11) of Sec. 9.7.) The box B has the volume AV = Ax Ay Az. Let v = [v^ u 2 . u 3 ] = i>ii + v 2 j + u 3 k 
be the velocity vector of the motion. We set 

(4) u = pv = l« lt « 2 » << 3 ] = Mj i + m 2 J + « 3 k 

and assume that u and v are continuously differentiable vector functions of a*, y, z, and t (that is, they have first 
partial derivatives which are continuous). Let us calculate the change in the mass included in B by considering 
the flux across the boundary, that is. the total loss of mass leaving B per unit time. Consider the flow through 
the left of the three faces of B that are visible in Fig. 216, whose area is Ax A z. Since the vectors t>| i and u 3 k 
are parallel to that face, the components and u 3 of v contribute nothing to this flow. Hence the mass of fluid 
entering through that face during a short time interval At is given approximately by 

(pv 2 ) y Ax Az At = (u 2 )y Ax Az At , 

where the subscript y indicates that this expression refers to the left face. The mass of fluid leaving the box 
B through the opposite face during the same time interval is approximately (u 2 )y+±y Ax Az At. where the 
subscript y -I- Ay indicates that this expression refers to the right face (which is not visible in Fig. 216). The 
difference 


A U9 

Au 2 Ax Az At = AV At 

Ay 


[Am 2 (m 2 )j/+a y (m 2 )^] 


is the approximate loss of mass. Two similar expressions are obtained by considering the other two pairs of 
parallel faces of B. If we add these three expressions, we find that the total loss of mass in B during the time 
interval A I is approximately 


where 


Ami Au2 ^ A»3 \ 

Ax Ay Az / 


AV At, 


A"i = - (Mi)* 


and Ah 3 = (m 3 ) z+A2 - ( u z ) z . 


This loss of mass in B is caused by the time rate of change of the density and is thus equal to 

bp 

— ~ Ay At. 
dr 



Fig. 216. Physical interpretation of the divergence 
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If we equate both expressions, divide the resulting equation by AV At. and let Ax. Ay. Az. and At approach zero, 
then we obtain 


div u = div (pv) = 

Ot 

or 


(5) — + div (pv) = 0. 

ot 

This important relation is called the condition for the conservation of mass or the continuity equation of a 
compressible fluid flow. 

If the flow is steady, that is. independent of time, then bplht = 0 and the continuity equation is 

(6) div (pv) = 0. 

If the density p is constant, so that the fluid is incompressible, then equation (6) becomes 

(7) div v = 0. 

This relation is known as the condition of incompressibility. It expresses the fact that the balance of outflow 
and inflow for a given volume element is zero at any time. Clearly, the assumption that the flow has no sources 
or sinks in R is essential to our argument. 

From this discussion you should conclude and remember that, roughly speaking, the divergence measures 
outflow minus inflow . H 

Comment. The divergence theorem of Gauss, an integral theorem involving the 
divergence, follows in the next chapter (Sec. 10.7). 


! PROBLEM SET 9.8 


[nt] calculation of the divergence 

Find the divergence of the following vector functions. 

1. [a 3 + y 3 , 3 av 2 , 3zy 2 ] 

2. [e 2x cos 2y. e 2x sin 2y, 5e 2z ] 

3. [x 2 + y 2 , 2.vyz, z 2 + a* 2 ] 

4. (.v 2 + y 2 4* z 2 r 3/2 [A, y, z] 

5. [sin A*y, sin Ay, z cos xy] 

6. [My, Mz. *). u 3 (a\ y)] 

7. .y 2 v 2 z 2 [a\ y, z] 

8. Let v = [a\ y, u 3 ]. Find a v 3 such that (a) div v > 0 
everywhere, (b) div v > 0 if \z\ < 1 and div v < 0 if 

w > i- 

9. (Incompressible flow) Show that die flow with 
velocity vector v = yi is incompressible. Show that the 
particles that at time / = 0 are in the cube whose faces 
are portions of the planes x - 0, a* = 1 , y = 0, y = l, 
z = 0, z = 1 occupy at t = 1 the volume 1 . 

10. (Compressible flow) Consider the flow with velocity 
vector v = A*i. Show that the individual particles have 
the position vectors r(t) = c x e l i -r c 2 j + c 3 k with 


constant c lf c* 2 , c 3 . Show that the particles that at t = 0 
are in the cube of Prob. 9 at / = 1 occupy the volume e. 

11. (Rotational flow) The velocity vector v(a\ y, z) of an 
incompressible fluid rotating in a cylindrical vessel is of 
the form v = w x r, where w is the (constant) rotation 
vector; see Example 5 in Sec. 9.3. Show that div v = 0. 
Is this plausible because of our present Example 2? 

12. CAS PROJECT. Visualizing the Divergence. Graph 
the given velocity field v of a fluid flow in a square 
centered at the origin with sides parallel to the coordinate 
axes. Recall that the divergence measures outflow minus 
inflow. By looking at the flow near the sides of the square, 
can you see whether div v must be positive or negative 
or may perhaps be zero? Then calculate div v. First do 
the given flows and dien do some of your own. Enjoy it. 

(a) v = i 

(b) V = A'i 

(c) v = jci - yj 

(d) v = Ai + yj 

(e) v = -Ai - yj 

(0 v = (a 2 + y 2 ) _1 (“.yi + -vj) 
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13. PROJECT. Useful Formulas for the Divergence. Prove 

(a) div (kv) = k div v (k constant) 

(b) div (/v) = / div v + v V/ 

(c) div (/ Vg) = fV 2 g + V/*Vg 

(d) div ( fVg ) - div ( gV/ ) = /V 2 g - gV 2 /. 

Verify (b) for / = and v = axi + by j + czk. 
Obtain the answer to Prob. 4 from (b). Verify (c) for 
/ = a * 2 - v 2 and g = e x + y . Give examples of your own 
for which (a)— (d) are advantageous. 


14-201 CALCULATION OF THE LAPLACIAN BY (3) 

Find V 2 / by (3). Check by differentiation. Indicate when 
(3) is simpler. (Show the details of your work.) 

14. / = xy/z 2 

15. / = (y + a-)/(v - A') 

16. / = z — 4V.v 2 + y 2 17. / = e* 2 ~ y2 cos 2xy 

18. / = arctan (yfx) 19. / = 

20. f = cos 2 a* — sin 2 y 


9.5 Curl of a Vector Field 


Gradient (Sec. 9.7), divergence (Sec. 9.8), and curl are basic in connection with fields, 
and we now define and discuss the curl. 

Let v(a, y, z ) = [v x , v 2 , u 3 ] = yxi + v 2 j + y 3 k be a differentiable vector function of 
die Cartesian coordinates a, y, z. Then the curl of the vector function v or of the vector 
field given by v is defined by the “symbolic” determinant 

* J k 

_a_ d 

dx dy dz 

V 1 v 2 v z 



This is the formula when a, y, z are right-handed. If they are left-handed, the determinant 
has a minus sign in front (just as in (2**) in Sec. 9.3). 

Instead of curl v one also uses the notation rot v (suggested by “rotation”; see Example 2). 


curl v = V x v = 


( 1 ) 


EXAMPLE I Curl of a Vector Function 

Let v = [vz, 3zx\ z] =» yzi + 3z.vj + zk with right-handed a, v\ z. Then (I) gives 

i j k 

curl v = \d/dx dldy 

yz 3 zx 


- -3.vi + yj + (3z — z)k = -3.vi + yj + 2zk. 


The curl plays an important role in many applications. Let us illustrate this with a typical 
basic example. More about the nature and significance of the curl will be said in 
Sec. 10.9. 


EXAMPLE 2 Rotation of a Rigid Body. Relation to the Curl 

We have seen in Example 5, Sec. 9.3, that a rotation of a rigid body B about a fixed axis in space can be 
described by a vector w of magnitude (o in the direction of the axis of rotation, where <o (> 0) is the angular 
speed of the rotation, and w is directed so that the rotation appears clockwise if we look in the direction of w. 
According to (9), Sec. 9.3. the velocity field of the rotation can be represented in the form 


v = w x r 
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THEOREM 1 
THEOREM 2 


PROOF 
EXAMPLE 3 


where r is the position vector of a moving point with respect to a Cartesian coordinate system having the origin 
on the axis of rotation . Let us choose right-handed Cartesian coordinates such that the axis of rotation is the 
z-axis. Then (see Example 2 in Sec. 9.4) 


w = [0, 0, to] - tok, v = w x r = [-toy, to*, 0] = -toyi + to.v j. 


Hence 


curl v = 


i J k 

AAA 

dx dy dz 

— (oy to.v 0 


= [0, 0, 2 a>] = 2wk = 2w. 


This proves the following theorem. 


Rotating Body and Curl 

The curl of the velocity field of a rotating rigid body has the direction of the axis 
of the rotation, and its magnitude equals twice the angular speed of the rotation . 


The following two relations among grad, div, and curl are basic and shed further light on 
the nature of the curl. 


Grad, Div, Curl 

Gradient fields are irrotational That is , if a continuously differentiable vector 
function is the gradient of a scalar function f, then its curl is the zero vector , 

(2) curl (grad f) = 0. 

Furthermore, the divergence of the curl of a twice continuously differentiable vector 
function v is zero , 

(3) div (curl v) = 0. 


Both (2) and (3) follow directly from the definitions by straightforward calculation. In the 
proof of (3) the six terms cancel in pairs. ■ 

Rotational and Irrotational Fields 

The field in Example 2 is not irrotational. A similar velocity field is obtained by stirring tea or coffee in a cup. 
The gravitational field in Theorem 3 of Sec. 9.7 has curl p = 0. It is an irrotational gradient field. H 

The term “irrotational” for curl v = 0 is suggested by the use of the curl for characterizing 
the rotation in a field. If a gradient field occurs elsewhere, not as a velocity field, it is 
usually called conservative (see Sec. 9.7). Relation (3) is plausible because of the 
interpretation of the curl as a rotation and of the divergence as a flux (see Example 2 in 
Sec. 9.8). 

Finally, since the curl is defined in terms of coordinates, we should do what we did for 
the gradient in Sec. 9.7, namely, to find out whether the curl is a vector. This is true, as 
follows. 
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THEOREM 3 


Invariance of the Curl 

curl v is a vector. That is, it has a length and direction that are independent of the 
particular choice of a Cartesian coordinate system in space. (Proof in App. 4.) 



1-6 


CALCULATION OF CURL 


Find curl v for v given with respect to right-handed 
Cartesian coordinates. Show the details of your work. 

1. [y, 2x 1 2 , 0] 

2. [y n , z n , A' n ] (n > 0, integer) 

3. [e x cos y, e x sin y , 0] 

4. (x 2 + y 2 + z 2 )~ 3/2 [x, y, z] 

5. [in {x 2 -I- y 2 ), 2 arctan (y/ x), 0] 

6. [sin y, cos z, —tan x] 


7. What direction does curl v have if v is a vector parallel 
to the „rz-plane? 

8. Prove Theorem 2. Give two examples for (2) and (3) 
each. 


9-14 


FLUID FLOW 


Let v be the velocity vector of a steady fluid flow. Is the 
flow irrotational? Incompressible? Find the streamlines 
(the paths of the particles). Hint. See the answers to Probs. 
9 and 1 1 for a determination of a path. 

9. v = [0, z 2 , 0] 


10. v = [— y 2 . 4, 0] 

11. v = [y, -x, 0] 


12. v = [esc x, sec .v, 0] 


13. v = [x, -y, z] 

14. v = [y 3 4 5 , -x 3 , 0] 

15. WRITING PROJECT. Summary on Grad, Div, 
Curl. List the definition and most important facts and 
formulas for grad, div, curl, and V 2 . Use your list to 
write a corresponding essay of 3^4 pages. Include 
typical examples of your own. 

16. PROJECT. Useful Formulas for the Curl. Assuming 
sufficient differentiability, show that 

(a) curl (u + v) = curl u + curl v 

(b) div (curl v) = 0 

(c) curl (/v) = (grad /) X v + / curl v 

(d) curl (grad /) = 0 

(e) div (u X v) = v*curl u — u*curl v. 

1 17-20 1 EXPRESSIONS INVOLVING THE CURL 

With respect to right-handed coordinates, let 
u = [y 2 , z 2 , x 2 ], v = [yzy zx, xy ], / = xyz , and 
g - x + y + z. Find the following expressions. If one of 
the formulas in Project 16 applies, use it to check your 
result. (Show the details of your work.) 

17. curl v, curl (/v), curl (gv) 

18. curl (/u), curl (gu) 

19. u x curl v, v x curl v, u*curl v, v*curl u 

20. curl (u X v), curl (v x u) 


T^-3T^£ya^W5QrtEE5TIONS AND PROBLEMS 


1. Why did we discuss vectors in R 2 and R 3 in a separate 
chapter, in addition to Chap. 7 on R n l 

2. What are applications that motivate inner products, 
vector products, scalar triple products? 

3. What is wrong with the expression a x b x c? With 
a*b*c? With (a*b) x c? 

4. What are scalar fields? Vector fields? Potentials? Give 
examples. 

5. What is the gradient? How is it related to directional 
derivatives? 


6. Explain “right-handed coordinates,” “orthonormal basis,” 
“tangential acceleration.” 

7. What is the definition of the divergence? Its physical 
meaning? Its relation to the Laplacian? 

8. Granted sufficient differentiability of a scalar function 
f and a vector function v, which of the following make 
sense? grad/, / grad/, v grad/, v • grad /, div/, 
div v, div (/v), curl (/v), curl /, / curl v, v curl /. 

9. If r(0 represents a motion, what is r'(/), |r'(/)|. r"(/), 
|r"»|? 
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10. How do you express the resultant of forces, the moment 
of a force, and the work done by a force in terms of 
vectors? 

11-20 1 VECTOR ADDITION, 

SCALAR MULTIPLICATION, PRODUCTS 

In right-handed coordinates let a = [3, 2, 7], 

b = [6, 5, —4], c = [1, 8, 0], d = [9, -2, 0]. 

Find 

11. 4a + b - c - 2d 

12 . a*b, a*c, a x c 

13. b x b, a x b, b x a 

14. 3a* 4a, 12a* a, 12|a| 2 , |b | 2 

15. 2c x 5d, 10c x d 

16. (a x b)*c, a*(b x c), (a b c) 

17. (a x b) x c, a x (b x c) 

18. (l/|a|)a, (l/|c|)c 

19. (a b d), (d a b) 

20 . ||a| - |b||, |a + b|, |a| + |b| 

21. (Angle) Find the angle between a and b. Between c and 
d. Sketch c and d. 

22. (Angle) Find the angle between the planes 
4x + 3y - z = 2 and x + y + z = 1. 

23. In what case isu x v = v x u? u • v = vu? 

24. (Resultant) Find u such that a, b, c, d above, and u are 
in equilibrium. 

25. (Resultant) Find the most general v such that the resultant 
of a, b, c, d above, and v is parallel to the Ay-plane. 

26. (Work) Find the work done by q = [5, 1, 0] in the 
displacement from (4, 4, 0) to (6, —1, 0). 

27. (Component) Find the component of u = [ - 1 , 5 , 0] 
in the direction of v = [3, 4, 0]. 

28. (Component) In what cases is the component of a in 
the direction of b equal to the component of b in the 
direction of a? 


29. (Component) When is the component of a in the 
direction of b negative? Zero? 

30. (Moment) Find the moment vector m of p = [4, 2, 0] 
about P: (5, 1, 0) if p acts on a line through (1,4, 0). 
Make a sketch. 

31. (Moment) In what cases is the moment of a force p =£ 0 
zero? 

32. (Velocity, acceleration) Find the velocity, speed, and 
acceleration of the motion given by 

r(/) = [5 cos t y sin r, 2t\ 

at the point P: [5/V2, 1/V2, 7 t/ 2 ]. What kind of 

curve is the path? 

33. (Tetrahedron) Find the volume of the tetrahedron with 
vertices (0, 0, 0), (1, 2, 0), (3, -3, 0), (1, l, 5). 

34. (Plane) Find an equation of the plane through (1,0, 2), 
(2, 3, 5), (3, 5, 7). 

35. (Linear dependence) Are [2, -1, 3], [4, 2, -5], 
[—1, 6, 0] linearly dependent? (Give reason.) 

36-45 1 GRAD, DIV, CURL, V 2 , 

DIRECTIONAL DERIVATIVE 

Let / = zy + yx, v = [>>, z, 4 z - x], w = [y 2 , z 2 , x 2 ]. 
Find 

36. grad / and / grad / at (3, 4, 0) 

37. (grad/) X grad/, (grad /)• grad / 

38. div v, div w 

39. curl v, curl w 

40. curl (grad /), div (grad /), div v 

41. V 2 (/), V 2 (/ 2 ) 

42. D„f at (1, 2, 0) 

43. D v f at (3, 7, 5) 

44. div (v x w) 

45. curl (v x w) + curl (\v x v) 



All vectors of the form a = [a Xi a 2 , <* 3 ] = a x i *f *f a s k constitute the real 
vector space 7 ? 3 with componentwise vector addition 

( 1 ) [fli, a 2 , fl 3 ] + [b-i, b 2 , 63 ] = [«i + b v a z + b 2 , a 3 + b 3 ] 
and componentwise scalar multiplication (c a scalar, a real number) 

( 2 ) c[a x , a 2 , a 3 ] = [ca lt ca 2 , ca 3 ] 


(Sec. 9.1). 
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For instance, the resultant of forces a and b is the sum a -I- b. 

The inner product or dot product of two vectors is defined by 

(3) a # b = |a||b| cos y = a l b 1 4- a 2 b 2 + a z b$ (Sec. 9.2) 

where y is the angle between a and b. This gives for the norm or length |a| of a 

(4) |a| = Va*a = V^ 2 + a 2 2 + a 3 2 

as well as a formula for y. If a*b = 0, we call a and b orthogonal. The dot product 
is suggested by the work W = p*d done by a force p in a displacement d. 

The vector product or cross product v = a x b is a vector of length 

(5) |a x b| = |a| |b| sin y (Sec. 9.3) 

and perpendicular to both a and b such that a, b, v form a right-handed triple. In 
terms of components with respect to right-handed coordinates. 


( 6 ) 


i J k 


a x b 


a i a 2 a 3 


b\ b 2 b 2 


(Sec. 9.3). 


The vector product is suggested, for instance, by moments of forces or by rotations. 
CAUTION! This multiplication is anticom mutative, a x b = -b x a, and is not 
associative. 

An (oblique) box with edges a, b, c has volume equal to the absolute value of 
the scalar triple product 

(7) (a b c) = a«(b x c) = (a x b)*c. 

Sections 9.4— 9.9 extend differential calculus to vector functions 


v(r) = OiM. v 2 (t), u 3 W] = ui(?)i + v 2 (t)j + v 3 (t)k 

and to vector functions of more than one variable (see below). The derivative of 
v(0 is 


( 8 ) 


d\ v(/ + A/) - v(t) 

— = lim 

dt At— o A t 


K, v' z , v 3 ] = v[i + v 2 j + v 3 k. 


Differentiation rules are as in calculus. They imply (Sec. 9.4) 

(u»v)' = u'»v + u»v\ (u X v)' = u r X v + u X v / . 

Curves C in space represented by the position vector r(t) have r'(t) as a tangent 
vector (the velocity in mechanics when t is time), r'(.f) (s arc length. Sec. 9.5) as the 
unit tangent vector, and |r"(^)j = k as the curvature (the acceleration in mechanics). 
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Vector functions v(jc, y, z) = [v^x, y, 4), .V* z)> v z( x * y* z)] represent vector 

fields in space. Partial derivatives with respect to the Cartesian coordinates x , y, z 
are obtained componentwise, for instance, 

dv [d Vl dv 2 du 3 “| dVi . dv 2 . dv 3 

^-LiT’ir'i7j = v + i7 J+ ir k (Sec - 9 - 6) - 


The gradient of a scalar function / is 

(9) 

The directional derivative of / in the direction of a vector a is 

( 10 ) 

The divergence of a vector function v is 

( 11 ) 

The curl of v is 


S J J i dx dy dz J 




,. _ dVi dv 

dxv v = V »v = + 

dx 


dy 


dv 3 

dz 


( 12 ) 


curl v = V x v = 


d_ 

dx 

Vl 


j 

_d_ 

dy 

v 2 


d_ 

dz 

Vz 


or minus the determinant if the coordinates are left-handed. 
Some basic formulas for grad, div, curl are (Secs. 9 .1-9. 9) 


(13) 

vc fg) = fig + gv/ 

V(//g) = (l/g 2 )(gV/ - fVg) 

(14) 

div (/v) = / div v + v* V/ 
div (fVg) = fV 2 g + V/*Vg 

(15) 

V 2 / = div (V/) 

V 2 (/g) = gV 2 / + 2 V/*Vg + /V 2 g 

(16) 

curl (/v) = V/ x v + / curl v 
div (u x v) = v • curl u — u • curl v 

(17) 

curl (V/) = 0 


(Sec. 9.7). 


(Sec. 9.7). 


(Sec. 9.8). 


(Sec. 9.9) 


div (curl v) = 0. 

For grad, div, curl, and V 2 in curvilinear coordinates see App. A3.4. 





CHAPTER 1 0 

Vector Integral Calculus. 
Integral Theorems 


This chapter is the companion to Chap. 9. Whereas the previous chapter dealt with 
differentiation in vector calculus, this chapter concerns integration. This vector integral 
calculus extends integrals as known from calculus to integrals over curves (“line 
integrals”), surfaces (“surface integrals”), and solids. We shall see that these integrals have 
basic engineering applications in solid mechanics, in fluid flow, and in heat problems. 

These different kinds of integrals can be transformed into one another. This is done to 
simplify evaluations or to gain useful general formulas, for instance, in potential theory 
(see Sec. 10.8). Such transformations are done by the powerful formulas of Green (line 
integrals into double integrals or conversely, Sec. 10.4), Gauss (surface integrals into triple 
integrals or conversely, Sec. 10.7), and Stokes (line integrals into surface integrals or 
conversely, Sec. 10.9). 

The root of these transformations was largely physical intuition. The corresponding 
formulas involve the divergence and the curl and will thus lead to a deeper physical 
understanding of these two operations. 

Prerequisite: Elementary integral calculus. Secs. 9.1-93 

Sections that may be omitted in a shorter course: 10.3, 10.5, 10.8 

References and Answers to Problems: App. 1 Part B, App. 2 


10.1 Line Integrals 

The concept of a line integral is a simple and natural generalization of a definite integral 
(1) f f(x)dx 

a 


known from calculus. In (1) we integrate the integrand f{x) from x = a along the jc-axis 
to x = b. In a line integral we shall integrate a given function, also called the integrand, 
along a curve C in space (or in the plane). Hence curve integral would be a better name, 
but line integral is standard. 

We represent the curve C by a parametric representation (as in Sec. 9.5) 

(2) •*(/) = [*(/), >■(/), 2 ( 0 ] = x(t)i + v(0j + z(0k (flSfS b). 
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The curve C is called the path of integration, A: r (a) its initial point \ and B: r (b) its 
terminal point. C is now oriented. The direction from A to B , in which t increases, is called 
the positive direction on C and can be marked by an arrow (as in Fig. 217a). The points 
A and B may coincide (as in Fig. 217b). Then C is called a closed path. 

C is called a smooth curve if it has at each point a unique tangent whose direction varies 
continuously as we move along C. Technically: r (t) in (2) is differentiable and the derivative 
r'(f) = drldt is continuous and different from the zero vector at every point of C. 

General Assumption 

In this book, every path of integration of a line integral is assumed to be piecewise smooth; 
that is, it consists of finitely many smooth curves. 

For example, the boundary curve of a square is piecewise smooth, consisting of four 
smooth curves (segments, the four sides). 

Definition and Evaluation of Line Integrals 

A line integral of a vector function F(r) over a curve C: r(r) [as in (2)] is defined by 
(3) f F(r)*dr = [ F(r(t))*r'(t) dt r' = -^ 

J C J a dt 

(see Sec. 9.2 for the dot product). In terms of components, with dr = [< dx , dy , dz ] as 
in Sec. 9.5 and 1 — d/dt , formula (3) becomes 

f F(rWr = f (F a dx + F 2 dy+ F 3 dz) 

J c J c 

( 3 ') 5 

= / (Fix' + F 2 y' + F s z’) dt. 

J a 


If the path of integration C in (3) is a closed curve, then instead of 


i 


we also write 


i 


Note that the integrand in (3) is a scalar, not a vector, because we take the dot product. 
Indeed, F # rV|r'| is the tangential component of F. (For “component” see (11) in Sec. 9.2.) 
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We see that the integral in (3) on the right is a definite integral of a function of t taken 
over the interval a ^ t ^ b on the /-axis in the positive direction (the direction of increasing 
/). This definite integral exists for continuous F and piecewise smooth C, because this 
makes F*r' piecewise continuous. 

Line integrals (3) arise naturally in mechanics, where they give the work done by a 
force F in a displacement along C (details and examples below). We may thus call the 
line integral (3) the work integral. Other forms of the line integral will be discussed later 
in this section. 


EXAMPLE 1 Evaluation of a Line Integral in the Plane 

Find the value of the line integral (3) when F(r) = [— v, -.tv] = -y\ - .vvj and C is the circular arc in 
Fig. 218 from A to B. 



Fig. 218. Example 1 


Solution . We may represent C by r(/) = [cost, sin/] = cos / i + sin t j, where 0 ^ ^ tt/2. Then 

x(t) = cos t , y(t ) = sin t , and 

F(r(/)) = — y(/)i - .v(/).v(/)j = [-sin /, -cos t sin /] = -sin / i — cos / sin / j. 

By differentiation, r'(f) = [—sin t, cos /] = —sin t i + cos t j, so that by (3) [use (10) in App. 3.1; set 
cos t = u in the second term] 


I F(rWr = I [-sin /. -cos / sin /] • [-sin /. cos /] dt = I (sin 2 1 - cos 2 1 sin t ) dt 
J c J o J o 

r /2 1 f° 7T 1 

-I — (1 - cos 2/) dt - I u 2 (-du) = -L — 0 — - ~ 0.4521. 

J q 2 4 3 


EXAMPLE 2 



Fig. 219. Example 2 


Line Integral in Space 

The evaluation of line integrals in space is practically the same as it is in the plane. To see this, find the value 
of (3) when F(r) = [z. .v. y] = zi + .vj 4* yk and C is the helix (Fig. 219) 

(4) r(/) = [cos /, sin /. 3/J = cos / i 4- sin t j 4- 3/k (0 ^ t % 2i r). 

Solution . From (4) we have ,v(/) = cos t, y(t ) = sin t, z{t) — 3 1 . Tlius 

F(r(/))*r'(r) = (3/i 4- cos/j 4- sin/ k)*(-sin ti 4- cos/ j 4- 3k). 

The dot product is 3/(— sin /) 4- cos 2 / 4- 3 sin t. Hence (3) gives 


I F(rWr = I (-3/ sin t 4- cos 2 / 4- 3 sin /) rf/ = 6tt 4- 7 r 4- 0 = 77r » 21.99. M 

J c J o 

Simple general properties of the line integral (3) follow directly from corresponding 
properties of the definite integral in calculus, namely, 


BP 

(5a) 

[ kF'dr = k [ F-dr 
J c J c 

(k constant) 


(5b) 

J(F + G)'dr = J F-dr + f G>dr 
c c c 


Fig. 220. 
Formula (5c) 

(5c) 

f F*dr = f F*dr + f F*dr 

c J c, J c 2 

(Fig. 220) 
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PROOF 


EXAMPLE 3 


where in (5c) the path C is subdivided into two arcs C x and C 2 that have the same 
orientation as C (Fig. 220). In (5b) the orientation of C is the same in all three integrals. 
If the sense of integration along C is reversed, the value of the integral is multiplied by 
— 1. However, we note the following independence if the sense is preserved. 


Direction-Preserving Parametric Transformations 

Any representations of C that give the same positive direction on C also yield the 
same value of the line integral (3). 


A proof follows by the chain rule, where r (t) is the given representation, r = <£(**) with 
a positive derivative dt/dt* is the transformation, with a* ^ t* =i b* corresponding to 
a ^ t ^ b in (3), and we write r(/) = r(<£(f*)) = r *(/*). Then dt = (dtldt*) dt* and 


r r b * dr dt 

I F(r*Wr* = F(r (<f>(r*)))- — ^ dt* 

J c J a* dt dt* 


= / F(r(/))*~ dt = f F(rWr. 

J a dt 


Motivation of the Line Integral (3): 

Work Done by a Force 

The work W done by a constant force F in the displacement along a straight segment d 
is W = F*d; see Example 2 in Sec. 9.2. This suggests that we define the work W done 
by a variable force F in the displacement along a curve C: r(f) as the limit of sums of 
works done in displacements along small chords of C. We show that this definition amounts 
to defining W by the line integral (3). 

For this we choose points t 0 (= a) < t 1 < • • • < t n (= b). Then the work A W m done 
by F(r(r m )) in the straight displacement from r(/ w ) to r(t m+1 ) is 

= F(r(f m ))*[r(f m+1 ) - r(/ m )] = F(r(f TO ))*r'(f m )Af m (A t m = t m+1 - t m ). 

The sum of these n works is W n = A W 0 + • • • -f A W n ^ x . If we choose points and consider 
W n for every n arbitrarily but so that the greatest A t m approaches zero as n — » <» then 
the limit of W n as n—>™ is the line integral (3). This integral exists because of our general 
assumption that F is continuous and C is piecewise smooth; this makes r '(/) continuous, 
except at finitely many points where C may have comers or cusps. ■ 


Work Done by a Variable Force 

If F in Example 1 is a force, the work done by F in the displacement along the quarter-circle is 0.4521, measured 
in suitable units, say, newton-meters (nt-m, also called joules, abbreviation J; see also front cover). Similarly in 
Example 2. ■ 
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EXAMPLE 4 


EXAMPLE 


Work Done Equals the Gain in Kinetic Energy 

Let F be a force, so that (3) is work. Let t be time, so that clr/di = v, velocity. Then we can write (3) as 


( 6 ) 


W 


= f F*dr = f 

■'r J n 


F(r(/)) • v(0 dt. 


Now by Newton’s second law (force = mass X acceleration), 

F = mr"(f) = mv'(/), 

where m is the mass of the body displaced. Substitution into (5) gives [see (11), Sec. 9.4] 


W 




tad 


On the right, /w|v| 2 /2 is the kinetic energy. Hence the work done equals the gain in kinetic energy. This is a 
basic law in mechanics. H 


Other Forms of Line Integrals 

The line integrals 

f F x dx, f F 2 dy, [ F 3 dz 

Jr* Jr* J r* 


(7) 


are special cases of (3) when F = FJ or F 2 j or F 3 k, respectively. 

Furthermore, without taking a dot product as in (3) we can obtain a line integral whose 
value is a vector rather than a scalar, namely, 


( 8 ) 


/ F(r) dt = j F(r(f)) dt = J [F 1 (r(/)), F 2 (r(r», F 3 (r(f))] dt. 


Obviously, a special case of (7) is obtained by taking = /, F 2 = F z = 0. Then 
(8*) / fix) dt = \ mt)) dt 

J C J a 

with C as in (2). The evaluation is similar to that before. 


A Line Integral of the Form (8) 

Integrate F(r) = [ry, yz, z] along the helix in Example 2. 

Solution . F(r(/)) = [cos t sin t, 3/ sin t, 3 /] integrated with respect to t from 0 to 2 tt gives 


J F(r (r)) dt = ^ cos 2 /, 3 sin / - 


- J 2 
3/ cos /, — / 

2 


2 7T 


= [O, — 67T, 6 tt 2 ]. 


Path Dependence 

Path dependence of line integrals is practically and theoretically so important that we 
formulate it as a theorem. And a whole section (Sec. 10.2) will be devoted to conditions 
under which path dependence does not occur. 
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THEOREM 2 


Path Dependence 

The line integral (3) generally depends not only on F and on the endpoints A and 
B of the path , but also on the path itself along which the integral is taken. 


PROOF Almost any example will show this. Take, for instance, the straight segment 
Ci: r x (t) = [f, t, 0] and the parabola C 2 : r 2 (0 = [t, t 2 , 0] with 0 ^ t ^ 1 (Fig. 221) and 
integrate F = [0, xy , 0]. Then F(r x (t))*r[(t) = t 2 , F(r 2 (f))*r 2 (/) = 2 f 4 , so that integration 
gives 1/3 and 2/5, respectively. ■ 



Fig. 221. Proof of Theorem 2 


FFfKBFggg M - s E T -: T B ^~ - : 


M2 


LINE INTEGRAL WORK DONE 
BY A FORCE 


Calculate I F(r)*r/r for the following data. If F is a force, 
J c 

this gives the work done in the displacement along C. 

(Show the details.) 

1. F = [y 3 , x 3 ], C the parabola y = 5x 2 from A: (0, 0) 
to B: (2, 20) 

2. F as in Prob. 1 , C the shortest path from A to B. Is the 
integral smaller? Give reason. 

3. F as in Prob. 1, C from A straight to (2, 0), then 
vertically up to B 

4. F = [. x 2 , y 2 , 0], C the semicircle from (2, 0) to 
(- 2 , 0 ), v ^0 

5. F = [xy 2 , x , 2 y], C: r = [cosh t , sinh f, 0], 

0 ^ ^ 2. Sketch C. 

6 . F = [e x , e y ] clockwise along the circle with center 
( 0 , 0 ) from ( 1 , 0 ) to ( 0 , - 1 ) 

7. F = [z, x, y], C: r = [cos /, sin /, /] from (1, 0, 0) 
to (1, 0, 4 -t r) 

8 . F = [cosh x, sinh y, e z ], C: r = [/, / 2 , / 3 ] from 
( 0 , 0 , 0 ) to ( 5 , 5 , g) 

9. F as in Prob. 8 , C the straight segment from (0, 0, 0) 
to (|, g) 

10. F = [x, -z, 2y] from (0, 0, 0) straight to (1, 1, 0), 
then to (I, 1 , 1 ), back to ( 0 , 0 , 0 ) 


11. F = [e x , e y , e z l r = [f, r 2 , t 2 ) from (0, 0, 0) to 
(2, 4, 4). Sketch C. 

12. F = [y 2 , x 2 , cos 2 z], C as in Prob. 7. Sketch C. 

13. WRITING PROJECT. From Definite Integrals to 
Line Integrals. Write a short report (1-2 pages) with 
examples on line integrals as generalizations of definite 
integrals. The latter give the area under a curve. Explain 
the corresponding geometric interpretation of a line 
integral. 

14. PROJECT. Independence of Representation. 

Dependence on Path. Consider the integral J F(r) • dr, 
where F = [xy, — y 2 ]. c 

(a) One path, several representations. Find the value 

of the integral when r = [cos r, sin /], 0 ^ ^ ir!2. 

Show that the value remains the same if you set t = —p 
or t = p 2 or apply two other parametric transformations 
of your own choice. 

(b) Several paths. Evaluate the integral when 
C: y = x n , thus r = [/, f n ], 0 ^ f ^ 1, where 
n = 1 , 2, 3, • • • . Note that these infinitely many paths 
have the same endpoints. 

(c) Limit. What is the limit in (b) as n — » <»? Can you 
confirm your result by direct integration without 
referring to (b)? 

(d) Show path dependence with a simple example of 
your choice involving two paths. 





426 


CHAP. 10 Vector Integral Calculus. Integral Theorems 


19. (AfL-Inequality, Estimation of Line Integrals) Let F 
be a vector function defined on a curve C. Let |F| be 
bounded, say, |F| ^ M on C, where M is some positive 
number. Show that 

(9) ( F'dr ^ ML (L = Length of C). 

| J c 

20. Using (9), find a bound for the absolute value of the 
work W done by the force F = [a* 2 , y] in the 
displacement along the segment from (0, 0) to (3, 4). 


10 .; Path Independence of Line Integrals 

In this section we consider line integrals 

(1) f F(r)-dr = f (F 1 dx + F 2 dy + F 3 dz) (dr = [dx, dy, dz ]) 

J c J c 

as before, and we shall now find conditions under which (1) is path independent in a 
domain D in space. By definition this means that for every pair of endpoints A, B in D 
the integral (1) has the same value for all paths in D that begin at A and end at B . (See 
Fig. 222. See Sec. 9.6 for “domain.”) 

Path independence is important. For instance, in mechanics it may mean that we have 
to do the same amount of work regardless of the path to the mountaintop, be it short and 
steep or long and gentle. Or it may mean that in releasing an elastic spring we get back 
the work done in expanding it. Not all forces are of this type — think of swimming in a 
big round pool in which the water is rotating as in a whirlpool. 

We shall follow up three ideas that will give path independence of (1) in a domain D 
if and only if: 

(Theorem 1) F = grad / (see Sec. 9.7 for the gradient). 

(Theorem 2) Integration around closed curves C in D always gives 0. 

(Theorem 3) curl F = 0 (provided D is simply connected, as defined below). 

Do you see that these theorems can help in understanding the examples and 
counterexample just mentioned? 

Let us begin our discussion with the following very practical criterion for path 
independence. 


THEOREM 1 




Fig. 222. Path 
independence 


1 15-181 INTEGRALS OF THE FORMS (8) AND (8*) 

Evaluate (8) or (8*) with F or / and C as follows. 

15. / = a * 2 + y 2 , C: r = [/, 4/, 0], 0 ^ t ^ 1 

16. / = l — sinh 2 jr, C the catenary r = [f, cosh f], 
O^t^l 

17. F = [y 2 , z 2 , a 2 ], C the helix 

[3 cos /, 3 sin f, 2/], 0 ^ ^ 87 t 

18. F = [(*y) 1/3 , (y/x) 1/3 , 0], C the hypocycloid 
r = [cos 3 /, sin 3 1, 0], 0 ^ t ^ wf 4 
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PROOF 


EXAMPLE 1 


EXAMPLE 2 


(a) We assume that (2) holds for some function / in D and show that this implies path 
independence. Let C be any path in D from any point A to any point B in D, given by 
r (t) = [x(t), y(r), z(r)], where a^t^b. Then from (2), the chain rule in Sec. 9.6, and 
(3') in the last section we obtain 


J Vi dx + F 2 dy + F 3 dz) = fjQ dx + ^ dy + ^ 

= r b (^fdx + ^f_dy_ + ^£ ffe \ dt 

J a \ dx dt dy dt dz dt ) 


= / ^dt = f[x(t), 3 -(f). z(t)] 


J a dt 


\t=b 


I t=*a 


= f(x(b), y(b\ z(b)) - f(x(a ), y(a), z(a)) 

= m - /(A). 


(b) The more complicated proof of the converse, that path independence implies (2) 
for some /, is given in App. 4. ■ 


The last formula in part (a) of the proof, 


(3) 


( (Fidx + F 2 dy + F 3 dz) = f(B) - f(A) 

J A 


[F = grad/] 


is the analog of the usual formula for definite integrals in calculus, 

b 

= G(b) - G(a) [G'(x) = g(x)]. 


Formula (3) should be applied whenever a line integral is independent of path. 

Potential theory relates to our present discussion if we remember from Sec. 9.7 that / is 
called a potential of F = grad /. Thus the integral (1) is independent of path in D if and 
only if F is the gradient of a potential in D. 

Path Independence 

Show that the integral I F*<Yr = I (2 xdx 4* 2 v dy + 4 zdz) is path independent in any domain in space 
J c J c 

and find its value in the integration from A: (0. 0, 0) to B: (2. 2. 2). 

Solution . F = [2a\ 2y, 4;] = grad /. where / = .v 2 + y 2 -I- 2 z 2 because dffd.x = 2x = df/dy = 2 y = F 2 , 
df/dz = 4 z = F 3 . Hence the integral is independent of path according to Theorem 1, and (3) gives 
f(B) - f(A) = /( 2. 2. 2) - /( 0. 0. 0) = 4 + 4 + 8 = 16. 

If you want to check this, use the most convenient path C: r (/) = [/, r, /], 0 £ / S 2, on which 
F(r(/) = [2 /, 2 4r], so that F(r(/)) •r'(d = 2/ + 2/ + 4/ = 8/. and integration from 0 to 2 gives 8 • 2 2 /2 = 16. 
If you did not see the potential by inspection, use the method in the next example. ■ 

Path Independence. Determination of a Potential 

Evaluate the integral / = j" (lr 2 dx + 2 yzdy + y 2 dz) from A: (0, 1, 2) to 8: (I, -1. 7) by showing that F 
has a potential and applying (3). 
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Solution . If F has a potential /. we should have 

J x = F, = 3.v 2 , f v = F 2 = 2 yz, f z = F s = y 2 . 

We show that we can satisfy these conditions. By integration of f x and differentiation, 

/ = A' 3 + g(y\ z), f y = g y = 2 yz, g = v 2 z + h(z\ f = a * 3 + y 2 z + h(z) 

f z - y 2 + h' = v 2 , h* =0 h = 0, say. 

This gives /(jc, y, z) = a* 3 + y 2 z and by (3), 

/ = /( 1, -1, 7) - /(0, 1, 2) = l + 7 - (0 + 2) = 6. ■ 


Path Independence and Integration 
Around Closed Curves 

The simple idea is that two paths with common endpoints (Fig. 223) make up a single 
closed curve. This gives almost immediately 

THEOREM 2 Path Independence 

The integral (1) is path independent in a domain D if and only if its value around 
every closed path in D is zero. 


PROOF 

C. 



C 2 

Fig. 223. Proof of 


Theorem 2 


If we have path independence, then integration from A to B along Cj and along C 2 in 
Fig. 223 gives the same value. Now C 2 and C 2 together make up a closed curve C, and 
if we integrate from A along C x to B as before, but then in the opposite sense along C 2 
back to A (so that this second integral is multiplied by —1), the sum of the two integrals 
is zero, but this is the integral around the closed curve C. 

Conversely, assume that the integral around any closed path C in D is zero. Given any 
points A and B and any two curves C x and C 2 from A to B in D, we see that C x with the 
orientation reversed and C 2 together form a closed path C. By assumption, the integral 
over C is zero. Hence the integrals over C 1 and C 2> both taken from A to B , must be equal. 
This proves the theorem. ■ 


Work. Conservative and Nonconservative (Dissipative) Physical Systems 
Recall from the last section that in mechanics, the integral (1) gives the work done by a 
force F in the displacement of a body along the curve C. Then Theorem 2 states that work 
is path independent in D if and only if its value is zero for displacement around every 
closed path in D. Furthermore, Theorem 1 tells us that this happens if and only if F is the 
gradient of a potential in D. In this case, F and the vector field defined by F are called 
conservative in D because in this case mechanical energy is conserved; that is, no work 
is done in the displacement from a point A and back to A. Similarly for the displacement 
of an electrical charge (an electron, for instance) in a conservative electrostatic field. 

Physically, the kinetic energy of a body can be interpreted as the ability of the body to 
do work by virtue of its motion, and if the body moves in a conservative field of force, 
after the completion of a round trip the body will return to its initial position with the 
same kinetic energy it had originally. For instance, the gravitational force is conservative; 
if we throw a ball vertically up, it will (if we assume air resistance to be negligible) return 
to our hand with the same kinetic energy it had when it left our hand. 
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Friction, air resistance, and water resistance always act against the direction of motion, 
tending to diminish the total mechanical energy of a system (usually converting it into 
heat or mechanical energy of the surrounding medium, or both), and if in the motion of 
a body these forces are so large that they can no longer be neglected, then the resultant 
F of the forces acting on the body is no longer conservative. Quite generally, a physical 
system is called conservative if all the forces acting in it are conservative; otherwise it 
is called nonconservative or dissipative. 

Path Independence and Exactness 
of Differential Forms 

Theorem 1 relates path independence of the line integral (1) to the gradient and Theorem 
2 to integration around closed curves. A third idea (leading to Theorems 3* and 3, below) 
relates path independence to the exactness of the differential form (or Pfqffian form 1 ) 

(4) F*dr = F x dx 4- F 2 dy + F z dz 

under the integral sign in (1). This form (4) is called exact in a domain D in space if it 
is the differential 

df df df 

df = — dx + — dy + — dz = (grad f)*dr 
dx dy dz 

of a differentiable function /(jc, y y z) everywhere in D, that is, if we have 

F*dr = df. 

Comparing these two formulas, we see that the form (4) is exact if and only if there is a 
differentiable function /(a*, y, z) in D such that everywhere in D, 

df df df 

(5) F = grad /, thus, F 1 = , F 2 = -A , F 3 = -f . 

dx dy dz 

Hence Theorem 1 implies 


THEOREM 3* 


Path Independence 

The integral (1) is path independent in a domain D in space if and only if the 
differential fonn (4) has continuous coefficient functions F x , F 2 , and is exact in D. 


This theorem is practically important because there is a useful exactness criterion. To 
formulate the criterion, we need the following concept, which is of general interest. 

A domain D is called simply connected if every closed curve in D can be continuously 
shrunk to any point in D without leaving D . 

For example, the interior of a sphere or a cube, the interior of a sphere with finitely 
many points removed, and the domain between two concentric spheres are simply 


‘JOHANN FRIEDRICH PFAFF (1765-1825), German mathematician. 
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THEOREM 3 


PROOF 


EXAMPLE 3 


connected, while the interior of a torus (a doughnut; see Fig. 247 in Sec. 10.6) and the 
interior of a cube with one space diagonal removed are not simply connected. 

The criterion for exactness (and path independence by Theorem 3*) is now as follows. 


Criterion for Exactness and Path Independence 

Let F l5 F 2 , F 3 in the line integral (1), 

f F(rWr = f {F 1 dx + F 2 dy + F z dz\ 

J c J c 

be continuous and have continuous first partial derivatives in a domain D in space. 
Then: 

(a) If the differential form (4) is exact in D — and thus (1 ) is path independent 


dF 2 _ dF\ 
dx dy 

(b) If (6) holds in D and D is simply connected, then (4) is exact in D — and 
thus (1) is path independent by Theorem 3*. 


by Theorem 3* — , then in D , 

( 6 ) 

in components (see Sec. 9.9) 

(4r'\ 9F 9 _ 9F 2 

(o ) ~z = — " , 

dy dz 


curl F = 0; 


dz 


dF 3 
dx ’ 


(a) If (4) is exact in D, then F = grad f in D by Theorem 3*, and, furthermore, 
curl F = curl (grad /) = 0 by (2) in Sec. 9.9, so that (6) holds. 

(b) The proof needs “Stokes’s theorem” and will be given in Sec. 10.9. ■ 


Line Integral in the Plane. Fori F(r)»dr=| {F x dx + F 2 dy) the curl has only one 

J c J c 

component (the z-component), so that (6') reduces to the single relation 


( 6 ") 


d F 2 _ dF\ 
dx dy 


(which also occurs in (5) of Sec. 1 .4 on exact ODEs). 

Exactness and Independence of Path. Determination of a Potential 

Using (6'). show that the differential form under the integral sign of 

/ = fj 2 *yz 2 dx + (vV + z cos yz) dy + (2x 2 yz + v cos yz) dz\ 

is exact, so that we have independence of path in any domain, and find the value of 1 from A: (0 0 11 
to B: ( 1 , 77 / 4 , 2). 
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EXAMPLE 


Solution. Exactness follows from (6'), which gives 

(F 3 ) y — + COS yz - yz sin yz = ( F 2 ) z 

(F 1 ) z = 4xyz = (F 3 ) x 
(F 2 ) x = 2xz 2 = (Fi) y . 

To find /, we integrate F 2 (which is “long,” so that we save work) and then differentiate to compare with F x 
and F 3 , 

/ = Jf 2 dy = J (x 2 z 2 + z cos yz) dy = x 2 z 2 y + sinyz + z) 
fx = 2 *z 2 y + g x = Fi= 2 xyz 2 , g x = 0, g = h(z) 
f z = 2x\y + y cos yz + h' = F z = 2x 2 zy + y cos yz, h ' = 0. 
h' = 0 implies h = const and we can take h = 0, so that g = 0 in the first line. This gives, by (3), 

f(x, y, z) = x 2 yz? + sin yz, f(B) - /(A) = l- -j-4 + siny-0=7r+l. ■ 


The assumption in Theorem 3 that D is simply connected is essential and cannot be omitted. 
Perhaps the simplest example to see this is the following. 


On the Assumption of Simple Connectedness in Theorem 3 


Let 

(7) 




y 

2 , 2 » 
x 4 - y 


Fz 


x 

~ x 2 y 2 


F z = 0. 


Differentiation shows tha t (6^) is s atisfied in any domain of the xy-plane not containing the origin, for example, 

in the domain D: | < < | shown in Fig. 224. Indeed, F 1 and F 2 do not depend on z, and F 3 = 0, 

so that the first two relations in (6') are trivially true, and the third is verified by differentiation: 

BF 2 _ x 2 + y 2 - x-2x _ y 2 — x 2 
Bx (x 2 + y 2 ) 2 ~ (x z + y 2 ) 2 * 

dFi _ a : 2 + y 2 - y • 2y _ y 2 - jc 2 

By (x 2 + y 2 ) 2 (x 2 + y 2 ) 2 ' 


Clearly, D in Fig. 224 is not simply connected. If the integral 


'-l 


/= (Fjdx + 



—ydx + xdy 

2 , 2 
x + y 


were independent of path in D, then / = 0 on any closed curve in D, for example, on the circle jt 2 •+ y 2 = 1. 
But setting x = r cos 0, y = r sin 0 and noting that the circle is represented by r = 1, we have 

x = cos 0, = -sin 0<f0, y = sin 0, dy = cos 0d0, 


so that — y + jc dy = sin 2 ddQ + cos 2 QdQ = d$ and counterclockwise integration gives 



Since D is not simply connected, we cannot apply Theorem 3 and cannot conclude that I is independent of path 
in D. 

Although F = grad /, where / = arctan (y/x) (verify!), we cannot apply Theorem 1 either because the polar 
angle / « $ = arctan ( y/x ) is not single- valued, as it is required for a function in calculus. ■ 
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EPR:OEBI££M^S?ET=r5 


[Ml path-independent integrals 

Show that the form under the integral sign is exact in the 
plane (Probs. 1-4) or in space (Probs. 5-8) and evaluate 
the integral. (Show the details of your work.) 


,( 4 . 77 / 8 ) 


1. I (y cos xy dx 4 x cos xy dy) 

J (0.0 ) 

-( 0 . 5 ) 

2. I ( y 2 e 2x dx 4 ye 2x dy) 

J (.5,0) 

-( 1 . 1 ) 

3. I e~ x2 ~ y2 (x dx 4 y dy) 

J (-i,-i > 

-(6,7 r) 

4. I (cos 2 y — 2 a: cos y sin y </y) 

J (2,0) 

-( 0 , 1 , 2 ) 

5. I (z (/a* 4 r/y 4 xe xz dz) 

J ( 2 . 3 , 0 ) 

-( 1 . 1 , 0 ) 

6. J e X 2 +'J 2 -2z ( X dx + y rfv - rfr) 

-'( 0 , 0 , 0 ) 

-( 7 , 8 , 0 ) 

7. I (2A*y (/a 4 a 2 dy 4 sinh z dz) 

-'( 1 , 0 . 0 ) 

J , ( 4 . 4 , 0 ) 

[2A(y 3 — z 3 ) (/a 4 3a 2 v 2 dy — 3 a 2 z 2 dz] 


17 ( 2 . 0 , 1 ) 


9. Show that in Example 4 of the text we have 

F = grad (arctan (y/A)). Give examples of domains in 
which the integral is path independent. 


(c) Integrate from (0, 0) along the straight-line 
segment to (c, 1), 0 = c ^ 1, and then horizontally to 
(1, 1). For c = l, do you get the same value as for 
b = 1 in (b)? For which c is I maximum? What is its 
maximum value? 



1 1 1-19 1 CHECK FOR PATH INDEPENDENCE 

and, if independent, integrate from (0, 0, 0) to ( a , b , c). 

11. (coshAz)U dx 4 a dz) 

12. (3aV* 4 a) dx 4 2aV" dy 

13. 3a 2 v dx 4 a 3 dy 4 y dz 

14. 2a sin y dx 4 a 2 cos y dy 4 y 2 dz 

15. ( ze x — e y ) dx - xe y dy 4 e x dz 

16. e x cos 2y dx — 2e x sin 2y dy — xz dz 

17. xy z 2 dx 4 \x 2 z 2 dy 4 a 2 vz dz 

18. yz cosh x dx 4 z sinh x dy 4 y sinh a dz 

19. y dx 4 (a — 2y) dy 4 4a dz 


10. PROJECT. Path Dependence, (a) Show that 

I — I (a 2 v dx 4 2 xy 2 dy) is path dependent in the 
J c 

Ay-plane. 

(b) Integrate from (0, 0) along the straight-line 
segment to (1, b\ 0 ^ b ^ K and then vertically up to 
(1, 1); see the figure. For which of these paths is / 
maximum? What is its maximum value? 


20. WRITING PROJECT. Ideas on Path Independence. 
Make a list of the main ideas on path independence 
and dependence in this section. Then work this list into 
an essay, including explanations of all definitions and 
on the practical usefulness of the theorems, but no 
proofs. Include illustrating examples of your own. 
Explain what happens in Example 4 if you take the 
domain 0 < Va 2 4 y 2 < §. 
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10 .: Calculus Review: Double Integrals. 

Optional 

Students familiar with double integrals from calculus should go on to the next 
section , skipping the present review, which is included to make the book reasonably 
self-contained. 

In a definite integral (1), Sec. 10.1, we integrate a function f(x) over an interval (a 
segment) of the *-axis. In a double integral we integrate a function f(x> y), called the 
integrand , over a closed bounded region 2 R in the xy-plane, whose boundary curve has a 
unique tangent at each point, but may perhaps have finitely many cusps (such as the 
vertices of a triangle or rectangle). 

The definition of the double integral is quite similar to that of the definite integral. 
We subdivide the region R by drawing parallels to the jc- and y-axes (Fig. 225). We 
number the rectangles that are entirely within R from 1 to n. In each such rectangle we 
choose a point, say, ( x k , y k ) in the &th rectangle, whose area we denote by A A k . Then 
we form the sum 


n 

4 = 2 /(**. y*) ^A k . 


fc= 1 


This we do for larger and larger positive integers n in a completely independent manner, 
but so that the length of the maximum diagonal of the rectangles approaches zero as n 
approaches infinity. In this fashion we obtain a sequence of real numbers • * • . 

Assuming that f(x, y) is continuous in R and R is bounded by finitely many smooth 
curves (see Sec. 10.1), one can show (see Ref. [GR4] in App. 1) that this sequence 
converges and its limit is independent of the choice of subdivisions and corresponding 
points (x k , y k ). This limit is called the double integral of f(x, y) over the region R, and 
is denoted by 


J J f(x, y) dx dy or J J f(x, y ) dA . 

R R 



Fig. 225. Subdivision of a region R 


2 A region R is a domain (Sec. 9.6) plus, perhaps, some or all of its boundary points. R is closed if its boundary 
(all its boundary points) are regarded as belonging to R; and R is bounded if it can be enclosed in a circle of 
sufficiently large radius. A boundary point P of R is a point (of R or not) such that every disk with center P 
contains points of R and also points not of R . 
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Double integrals have properties quite similar to those of definite integrals. Indeed, for 
any functions / and g of (x, y), defined and continuous in a region R> 


j j kfdxdy = kjjf dxdy 

R R 


( k constant) 


( 1 ) 


/ j(f + g)dxdy = J Jf dxdy + J Jgdxdy 


f Jfdxdy = j jf dxdy + J jfdxdy (Fig. 226). 

R Hj Jf?2 

Furthermore, if R is simply connected (see Sec. 10.2), then there exists at least one point 
(jc 0 , yo) in R such that we have 


(2) Jjf(x,y)dxdy = f(x 0 ,y 0 )A, 

R 

where A is the area of R. This is called the mean value theorem for double integrals. 



Fig. 226. Formula (1) 


Evaluation of Double Integrals 
by Two Successive Integrations 

Double integrals over a region R may be evaluated by two successive integrations . We 
may integrate first over y and then over *. Then the formula is 


(3) 


J j fix, y) dxdy = 

R 


J) ” Mxf) 

J J * 

J a g(.x) 


f(x,y)dy 


dx 


(Fig. 227). 


Here y = g(x) and y = h(x) represent the boundary curve of R (see Fig. 227) and, keeping 
x constant, we integrate /(*, y) over y from g(x) to h{ x). The result is a function of x , and 
we integrate it from x = a to x = b (Fig. 227). 

Similarly, for integrating first over x and then overy the formula is 


J Jf(x,y)dxdy = 

R 



dy 


(4) 


(Fig. 228). 
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Fig. 227. Evaluation of a double integral Fig. 228. Evaluation of a double integral 


The boundary curve of R is now represented by x = p{y) and x = q{y). Treating y as a 
constant, we first integrate /( x, y) over x from piy) to q(y) (see Fig. 228) and then the 
resulting function of y from y = c to y = d. 

In (3) we assumed that R can be given by inequalities a^x^b and g(x) ^ y ^ h(x ). 
Similarly in (4) by c ^ y ^ d and piy) ^kx^k q(y). If a region R has no such representation, 
then in any practical case it will at least be possible to subdivide R into finitely many 
portions each of which can be given by those inequalities. Then we integrate f(x , y) over 
each portion and take the sum of the results. This will give the value of the integral of 
f(x, y) over the entire region R. 

Applications of Double Integrals 

Double integrals have various physical and geometric applications. For instance, the area 
A of a region R in the -vy-plane is given by the double integral 


A = J Jdxdy. 

R 

The volume V beneath the surface z = fix, y) (> 0) and above a region R in the xy-plane 
is (Fig. 229) 

v = J Jf(x, y) dx dy 

R 

because the term f(x k , y k ) A A k in J n at the beginning of this section represents the volume 
of a rectangular box with base of area A A k and altitude f(x k , y k ). 



Fig. 229. Double integral as volume 
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As another application, let f(x, y) be the density (= mass per unit area) of a distribution 
of mass in the xy-plane. Then the total mass M in R is 

M = j jf(x,y)dxdy; 

R 

the center of gravity of the mass in R has the coordinates x, y, where 

x = / J xf(x, y) dx dy and y = J J yf(x, y) dx dy; 

R R 

the moments of inertia l x and l y of the mass in R about the jc- and y-axes, respectively, are 
/*=//; y 2 f(x, y) dx dy, I y = J j x 2 f(x, y) dx dy; 

R R 

and the polar moment of inertia / 0 about the origin of the mass in R is 
I 0 = I x + I y = j j(x 2 + y z )f(x, y) dx dy. 

R 

An example is given below. 


Change of Variables in Double Integrals. Jacobian 

Practical problems often require a change of the variables of integration in double integrals. 
Recall from calculus that for a definite integral the formula for the change from x to u is 

r b dx 

(5) J fix) dx = I f(x(u)) — du. 

J a J a du 

Here we assume that x = x{u) is continuous and has a continuous derivative in some 
interval a ^ u ^ such that *(a) = a, x(f3) = b [or x(a) = b , a(/3) = a] and x(u) varies 
between a and b when u varies between a and /3. 

The formula for a change of variables in double integrals from jc, y to w, v is 


( 6 ) 


y) dx dy = J Jf(x(u, v), y(u , v)) 

R R* 


d(s. y) 

d(u, v) 


du dv. 


that is, the integrand is expressed in terms of u and v, and dx dy is replaced by du dv times 
the absolute value of the Jacobian 3 


( 7 ) 


d(x, y) 
d(u, v ) 


dx 

dx 





du 

dv 

dx 

dy_ _ 

dx 

dy 

du 

dy_ 

dv 

du 

dv 

dv 

du 


3 Named after the German mathematician CARL GUSTAV JACOB JACOBI (1804-1851), known for his 
contributions to elliptic functions, partial differential equations, and mechanics. 
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EXAMPLE 1 


Here we assume the following. The functions 

x = x(u u\ y = y(it , u ) 

effecting the change are continuous and have continuous partial derivatives in some region 
R* in the wu-plane such that for every (w, u) in R* the corresponding point (x, y) lies in 
R and, conversely, to every (x, y) in R there corresponds one and only one (m, v ) in /?*; 
furthermore, the Jacobian J is either positive throughout R* or negative throughout R *. 
For a proof, see Ref. [GR4] in App. 1. 

Change of Variables in a Double Integral 

Evaluate the following double integral over the square R in Fig. 230. 

J j(x 2 + y 2 )dxdy 
R 

Solution. The shape of R suggests the transformation x + y = u, x — y = v. Then x = | (u 4- u), 
y = \(u - u). The Jacobian is 

J _ d( x > y) _ 5 5 

~ «(«,») “ I —I 

J? corresponds to the square 0 ^ u ^ 2, 0 ^ t; ^ 2. Therefore, 

2 2 

/ / (-V 2 + >- 2 ) dv dy = (u 2 + v 2 )^d',dv = j. m 




Of particular practical interest are polar coordinates /• and which can be introduced 
by setting x — r cos 0, y = /* sin 0. Then 


and 


gte y) 

0(r, 0) 


cos 0 
sin 0 


—/* sin 0 
r cos 0 




( 8 ) 


/ //(■*> y)dxdy = J Jf(r cos 0, rsin 0) rdrdd 

R R* 


where is the region in the r0-plane corresponding to R in the xy-planc. 
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EXAMPLE 2 Double Integrals in Polar Coordinates. Center of Gravity. Moments of Inertia 

Let /(*, y) = 1 be the mass density in the region in Fig. 231. Find the total mass, the center of gravity, and the 
moments of inertia I x , I y , / 0 . 

Solution, We use the polar coordinates just defined and formula (S). This gives the total mass 



Fig. 231. 
Example 2 


M = JJdxdy = J J r dr dd = J - dd = f . 


The center of gravity has the coordinates 


..±fj 

7 r J n 


r cos 0 r dr dO 


_ ± r /2 i 

7 r Jn 3 


cos BdO = — = 0.4244 
3 77 


y — — — for reasons of symmetry. 

3 77 


The moments of inertia arc 


l x — J J y 2 dx dy = J J r 2 sin 2 d r dr dd = J ■— sin 2 B dO 

f^ 2 1 1 ( 7T \ 7T 

= J 0 ? (1 -cos2 ©^= i ( T -0 ) = - =0.1963 


Iy = — for reasons of symmetry. 


/o = ix + /y « y = 0.3927. 


Why are Jc and y less than 5 ? 


This is the end of our review on double integrals. These integrals will be needed in this 
chapter, beginning in the next section. 


P.RiO.B LEM SET 103 


1. (Mean value theorem) Illustrate (2) with an example. 

2^1 DOUBLE INTEGRALS 

Describe the region of integration and evaluate. (Show the 
details.) 


f 1 f 2 * 

2. II (x H- y) 2 dy dx 

J o J x 

3. f [ (1 - 2xy) dy dx 
J o J * 2 

4. As Prob. 3, order reversed 

5. I I cosh (x + y) dx dv 
J o J o 

6. As Prob. 5, order reversed 

e* +2y dy dx 


A A-** 


8. f f x 2 y dy dx 

J 0 J l-x 

n sin y 

e x cos y dx dy 

\ 


10. Integrate xye? v over the triangular region with 
vertices (0, 0), (1, 1), (1, 2). 

11-131 VOLUME 

Find the volume of the following regions in space. 

11. The region beneath z = x 2 + y 2 and above the square 
with vertices (1, 1), (-1, 1), (-1, -1), (1, -1) 

12. The tetrahedron cut from the first octant by the plane 
|r + 2y + z = 6. Check by vector methods. 

13. The first octant section cut from the region inside the 
cylinder a- 2 + z 2 = 1 by the planes y = 0, z = 0, jc = y. 
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1 14-16 1 CENTER OF GRAVITY 

Find the center of gravity (.v. y) of a mass of density 
/(*, y) = 1 in the given region /?. 

14. R the semidisk x 2 + y 2 £ a 2 , y £ 0 



17-20 


MOMENTS OF INERTIA 


Find the moments of inertia / x , I y , 7 0 of a mass of density 
fix, y) = 1 in the region R shown in the figures (which the 


engineer is likely to need, along with other profiles listed 
in engineering handbooks). 

17. R as in Prob. 15. 18. R as in Prob. 16. 




10.4 Green's Theorem in the Plane 

Double integrals over a plane region may be transformed into line integrals over the 
boundary of the region and conversely. This is of practical interest because it may simplify 
the evaluation of an integral. It also helps in the theory whenever we want to switch from 
one kind of integral to the other. The transformation can be done by the following theorem. 


THEOREM 1 


Green’s Theorem in the Plane 4 

(Transformation between Double Integrals and Line Integrals) 

Let R be a closed bounded region (see Sec. 10.3) in the xy-plane whose boundary 
C consists of finitely many smooth cun>es (see Sec. 1 0.1). Let F x ( a\ y) and F 2 (x, y) 
be functions that are continuous and have continuous partial derivatives dF x /c)y and 
dF 2 /Bx everywhere in some domain containing R. Then 

<» SJ (it - 17) ** - £ <f ' * + *>• 

Here we integrate along the entire boundary C of R in such a sense that R is on 
the left as we advance in the direction of integration (see Fig. 232 on p. 440). 


4 GE0RGE GREEN (1793-1841). English mathematician who was self-educated, started out as a baker, and 
at his death was fellow of Caius College. Cambridge. His work concerned potential theory in connection with 
electricity and magnetism, vibrations, waves, and elasticity theory. It remained almost unknown, even in England, 
until after his death. 

A “domain containing R ” in the theorem guarantees that the assumptions about F x and F 2 at boundary points 
of R are the same as at other points of R. 
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EXAMPLE 1 


PROOF 



Fig. 232. Region R whose boundary C consists of two parts: 
C, is traversed counterclockwise, while C 2 is traversed 
clockwise in such a way that R is on the left for both curves 


Setting F = [F l5 F 2 ] = F x \ + F 2 j and using (1) in Sec. 9.9, we obtain (1) in vectorial 
form, 


a') 


J f (curl F)*k dxdy = F*dr. 


The proof follows after the first example. For $ see Sec. 10.1. 

Verification of Green’s Theorem in the Plane 

Green’s theorem in the plane will be quite important in our further work. Before proving it. let us get used to 
it by verifying it for F 1 = y 2 - 7y, F 2 = 2vy + 2x and C the circle .v 2 -I- y 2 = 1. 

Solution . In (1) on the left we get 

1 1 (l7 " 17 ) dxdy = / 1 l(2y + 2) ~ (2 - v ~ 7 ) i rfvrf >' = 9 J f drt/ y = 9,7 

since the circular disk R has area 7 r. 

We now show that the line integral in (1) on the right gives the same value, 9 tt. We must orient C 
counterclockwise, say, r(/) = [cos /, sin /]. Then r'(r) = [-sin cos r], and on C, 

Fi — y 2 — ly = sin 2 1 — 7 sin r, F 2 = 2xy + 2 a* = 2 cos / sin / + 2 cos /. 

Hence the line integral in (1) becomes, verifying Green’s theorem, 

<P (F t x* + F z y f ) dt = I [(sin 2 1 - 7 sin r)(— sin t ) + 2(cos t sin t + cos f)(cos t )] dt 

J c J o 

r 27r 

= I (-sb 3 1 + 7 sin 2 / + 2 cos 2 / sin t + 2 cos 2 /) dt 

J r\ 


= 0 + 777 —0 + 277= 977. I 

We prove Green’s theorem in the plane, first for a special region R that can be represented 
in both forms 


and 


a ^ x ^ b, u(x) ^ y ^ v(x) 

c = J = 4 p{y) S x S ^0') 


(Fig. 233) 
(Fig. 234). 
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Fig. 233. Example of a special region 



Using (3) in the last section, we obtain for the second term on the left side of ( 1 ) taken without 
the minus sign 


( 2 ) 


r r dF r b ~ ayr 

I I — - dx dy = I I — - dy dx (see Fig. 233). 

V & * a |A<*> dy J 


(The first term will be considered later.) We integrate the inner integral: 


r v(x) % J7 

I 17 dy = Fl(x> y) 

J u(.t) oy 


y=v (.r) 


y~u(.x) 


= F ± [x, v(x)] - F x [x, u(x)]. 


By inserting this into (2) we find (changing a direction of integration) 


f f — 1 dxdy = \ F x [ a\ v(x)] dx - f F a [ m(jc)] 

J R J dy J a J a 

= — J F x [x 9 u(a)] dx — J Frfx, u(x)] dx. 


Since y = v(x) represents the curve C** (Fig. 233) and y = u(x) represents C*, the last 
two integrals may be written as line integrals over C ** and C* (oriented as in Fig. 233); 
therefore, 


(3) 


[ f — 1 dxdy = - [ Fife y) ^ [ Fxfo y) 

dy J c** J c* 

= F x (x, 3 O dx. 


dx 


This proves ( 1 ) in Green’s theorem if F 2 = 0. 

The result remains valid if C has portions parallel to the y-axis (such as C and C in 
Fig. 235). Indeed, the integrals over these portions are zero because in (3) on the right we 
integrate with respect to x. Hence we may add these integrals to the integrals over C* and 
C** to obtain the integral over the whole boundary C in ( 3 ). 

We now treat the first term in (1) on the left in the same way. Instead of (3) in the last 
section we use (4), and the second representation of the special region (see Fig. 234). 
Then (again changing a direction of integration) 



442 


CHAP. 10 Vector Integral Calculus. Integral Theorems 


EXAMPLE 2 


R dx J c L J pc») dx 

= J ^2(9(y). y) dy + f F z (p(y), y) dy 

J c J d 

= $ F z (x, y) dy. 



Together with (3) this gives (1) and proves Green’s theorem for special regions. 

We now prove the theorem for a region R that itself is not a special region but can be 
subdivided into finitely many special regions (Fig. 236). In this case we apply the theorem 
to each subregion and then add the results; the left-hand members add up to the integral 
over R while the right-hand members add up to the line integral over C plus integrals over 
the curves introduced for subdividing R . The simple key observation now is that each of 
the latter integrals occurs twice, taken once in each direction. Hence they cancel each 
other, leaving us with the line integral over C. 

The proof thus far covers all regions that are of interest in practical problems. To prove 
the theorem for a most general region R satisfying the conditions in the theorem, we must 
approximate R by a region of the type just considered and then use a limiting process. 
For details of this see Ref. [GR4] in App. 1. ■ 

Some Applications of Green's Theorem 


Area of a Plane Region as a Line Integral Over the Boundary 

In (1) we first choose Fi = 0, F 2 = x and then F 1 — — y, F 2 = 0. This gives 


J J dxdy = .v dy and j J dxdy = ydx 


respectively. The double integral is the area A of R. By addition we have 
(4) A = ^ <fi (a* dy - y dx) 

Z J r* 


where we integrate as indicated in Green’s theorem. This interesting formula expresses the area of R in terms 
of a line integral over the boundary. It is used, for instance, in the theoiy of certain planimeters (mechanical 
instruments for measuring area). See also Prob. 17. 
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EXAMPLE 3 


EXAMPLE 4 


For an ellipse x 2 /a 2 + y 2 fb 2 = 1 or a* = a cos /, y = b sin t we get x = - a sin t, y f = b cos /; thus from 
(4) we obtain the familiar formula for the area of the region bounded by an ellipse. 

1 c 27r 1 c 2 * 

A = — I (xy — yx')dt = — I [n/?cos 2 f — (— ab sin 2 r)] d/ = irab. I 

2 2 Jrx 


Area of a Plane Region in Polar Coordinates 

Let r and 0 be polar coordinates defined by a : — r cos 0, y = r sin 6. Then 

dx = cos Odr - r sin 0d0, dy = sin 0dr + rcos 0d0, 
and (4) becomes a formula that is well known from calculus, namely. 


(5) 


-it 


r z d$. 


As an application of (5), we consider the cardioid r = o(l - cos 0), where 0 ^ 0 ^ 2 tt (Fig. 237). We find 


,2 r 27T 


= -f 

2 J r 


(1 - cos efdd= — a*. 
0 2 


37T 


Transformation of a Double Integral of the Laplacian of a Function 
into a Line Integral of Its Normal Derivative 

The Laplacian plays an important role in physics and engineering. A first impression of this was obtained in 
Sec. 9.7, and we shall discuss this further in Chap. 12. At present, let us use Green's theorem for deriving a 
basic integral formula involving the Laplacian. 

We take a function w(.v, y) that is continuous and has continuous first and second partial derivatives in a 
domain of the .ry-plane containing a region R of the type indicated in Green’s theorem. We set F x = —Bw/By 
and F 2 = Bw/Bx. Then BF-jBy and BF 2 /Bx are continuous in /?, and in (1) on the left we obtain 


( 6 ) 


BF 2 or i a »v o w 0 

T — = — 3T + — T = V 2 w, 


9F, 


dx dy dx* By* 

the Laplacian of w (see Sec. 9.7). Furthermore, using those expressions for Fj and F 2 , we get in (1) on the right 

f f / dx dy\ f ( Bw dx dy \ 

( 7 ) £<** + * 4 ) = 

where s is the arc length of C, and C is oriented as shown in Fig. 238. The integrand of the last integral may 
be written as the dot product 


( 8 ) 


f dw c)h> ~] f dy dx 1 Bw dy Bw dx 

(erad w) • n = — , — • — , = — — — — — . 

; l dx ’ By J L ds da J dx ds By da 


y 

( \ 

\ 


X 

\* . 

J 

^ 




Fig. 237. Cardioid 


Fig. 238. Example 4 
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The vector n is a unit normal vector to C, because the vector r'ts) = dvlds = [ dxlds , dylds] is the unit 
tangent vector of C, and r ; »n = 0, so that n is perpendicular to r ; . Also* n is directed to the exterior of C 
because in Fig. 238 the positive .v-component dxtds of r' is tlie negative v-component of n, and similarly at 
other points. From this and (4) in Sec. 9.7 we see that the left side of (8) is the derivative of w in the direction 
of the outward normal of C. This derivative is called the normal derivative of w and is denoted by dw/dn; 
that is, dwldn = (grad u*)*n. Because of (6). (7), and (8). Green's theorem gives the desired formula relating 
the Laplacian to the normal derivative, 


(9) 


f f V 2 w dx dy = <f — ds. 
V J C dn 


For instance, ir = .v 2 — .v 2 satisfies Laplace’s equation V 2 iv = 0. Hence its normal derivative integrated over 
a closed curve must give 0. Can you verify this directly by integration, say, for the square 0 ^ ^ 1, 

O^y^l? W ■ 

Green’s theorem in the plane may facilitate the evaluation of integrals and can be used in 
both directions, depending on the kind of integral that is simpler in a concrete case. This 
is illustrated further in the problem set. Moreover, and perhaps more fundamentally. 
Green's theorem will be the essential tool in the proof of a very important integral theorem, 
namely, Stokes's theorem in Sec. 10.9. 


PROBLEM SET 10.4 


1 1—12 1 EVALUATION OF LINE INTEGRALS 
BY GREEN’S THEOREM 

Using Green's theorem, evaluate I F(r) • dr counterclockwise 

J c 

around the boundary curve C of the region R, where 

1. F = [|jcy 4 , R the rectangle with vertices (0, 0), 

(3, 0), (3, 2), (0, 2) 

2. F = [y sin a*, 2a cos _y], R the square with vertices 
(0, 0), (|t7, 0), \tt % (0, t) 

3. F = [-y 3 , a 3 ], C the circle a 2 + y 2 = 25 

4. F = [ -e y , e x ], R the triangle with vertices (0, 0), 

(2. 0). (2. 1) 

5. F = [ e x+y , e x ~ v ], R the triangle with vertices (0, 0), 
(1, 1X0,2) 

6. F = [a cosh y, a* 2 sinh y], R: a 2 ^ y ^ a. Sketch R. 

7. F = [a 2 + y 2 , a* 2 - y 2 ], R: 1 ^ y ^ 2 - a 2 . Sketch 

R. 

8. F = [e x cosy, — e x sin y], R the semidisk 
a 2 + y 2 ^ a 2 , a* = 0 

9. F = grad (a 3 cos 2 (Ay)), R the region in Prob. 7 

10. F = [x In y. ye x ], R the rectangle with vertices (0, 1 X 
(3, 1), (3, 2), (0, 2) 

11. F = [Zx - 3y, a + 5y], R: 16a* 2 + 25y 2 ^ 400, y^0 

12. F = [a‘V, — A*/y 2 ], R: 1 ^ a 2 + y 2 ^ 4, x ^ 0, 

y = a. Sketch R. 


13-16 


INTEGRAL OF THE NORMAL DERIVATIVE 
ds counterclockwise over the 


Using (9), evaluate 

J c dn 

boundary curve C of the region R. 


4 — , 

Jr dn 


13. vv = sinh a, R the triangle with vertices (0, 0), (2, 0), 

( 2 , 1 ) 

14. vv = a 2 + y 2 , C: a 2 + y 2 = l. Confirm the answer by 
direct integration. 

15. vv = 2 In (a 2 + y 2 ) + av 3 , R: 1 ^ y S 5 - a 2 a ^ 0 

16. vv — a 6 y 4- A*y 6 , R: a 2 + y 2 ^ 4. y i 0 


17. CAS EXPERIMENT. Apply (4) to figures of your 
choice whose area can also be obtained by another 
method and compare the results. 

18. (Laplace’s equation) Show that for a solution vv(a, y) 
of Laplace’s equation V 2 w — 0 in a region R with 
boundary curve C and outer unit normal vector n, 


( 10 ) 


1 dvv 

= <p vv ds. 

Jr dn 


19. Show that w = 2e x cos y satisfies Laplace’s equation 
V 2 w = 0 and, using (10), integrate w(dw!dn) 
counterclockwise around the boundary curve C of the 
square 0 ^ x ^ 2, 0 ^ y ^ 2. 
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20. PROJECT. Other Forms of Green’s Theorem in 
the Plane. Let R and C be as in Green’s theorem, r' 
a unit tangent vector, and n the outer unit normal vector 
of C (Fig. 238 in Example 4). Show that (1) may be 
written 


J J div F dx dy = < > F • n ds 



where k is a unit vector perpendicular to the jcy-plane. 
Verify (11) and ( 1 2) for F = [7jc, —3 y] and C the circle 
x 2 4- y 2 = 4 as well as for an example of your own 
choice. 


10 . Surfaces for Surface Integrals 

Having introduced double integrals over regions in the plane, we turn next to surface 
integrals, in which we integrate over surfaces in space, such as a sphere or a portion of a 
cylinder. For this we must first see how to represent a surface. And we must discuss 
surface normals, since they are also needed in surface integrals. For simplicity we shall 
say “surface” also for a portion of a surface. 

Representation of Surfaces 

Representations of a surface S in Ayz-space are 

(1) z = f(x, y) or g(x f y, z) = 0. 

For example, z = 4-Va 2 - x 2 ~ y 2 or x 2 + y 2 + z 2 - a 2 = 0 (z = 0) represents a 
hemisphere of radius a and center 0. 

Now for cumes C in line integrals, it was more practical and gave greater flexibility to 
use a parametric representation r = r(r), where a^t^b. This is a mapping of the interval 
a ^ ^ b, located on the /-axis, onto the curve C (actually a portion of it) in jryz-space. 

It maps every t in that interval onto the point of C with position vector r(/). See Fig. 239A. 



(A) Curve (B) Surface 

Fig. 239. Parametric representations of a curve and a surface 
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EXAMPLE 1 


EXAMPLE 2 


Similarly, for surfaces S in surface integrals, it will often be more practical to use a 
parametric representation. Surfaces are Two-dimensional. Hence we need two parameters, 
which we call u and v. Thus a parametric representation of a surface S in space is of 
the form 

(2) r(«, v) = [a*(m, v ), y(u , i>), z(u , o)] = x(u , o)i + y(w, v)j + z(u y u)k 


where (w, v) varies in some region R of the wo-piane. This mapping (2) maps every point 
(m, o) in R onto the point of S with position vector r(w, v). See Fig. 239B. 

Parametric Representation of a Cylinder 

The circular cylinder x 2 -I- y 2 = a 2 . —\ = z = 1. has radius a , height 2. and the z-axis as axis. A parametric 
representation is 

r(u, v) = [a cos u, a sin u,v] = ci cos u i + a sin t< j + uk (Fig. 240). 

The components of r are a* = a cos u t y = a sin u , z = v. The parameters u, v vary in the rectangle 
R: 0 = // = 2tt« — 1 ^ y 21 1 in the //u-planc. The curves u = co/m are vertical straight lines. The curves 
v = const are parallel circles. The point P in Fig. 240 corresponds to u - nf 3 = 60°, v = 0.7. ■ 



Fig. 240. Parametric representation 
of a cylinder 


z 



Fig. 241. Parametric representation 
of a sphere 


Parametric Representation of a Sphere 

A sphere .v 2 -I- v 2 + z 2 = a 2 can be represented in the form 

(3) r(//, v) = a cos v cos // i + a cos v sin u j 4* a sin v k 

where the parameters u, v vary in the rectangle R in the w-plane given by the inequalities 0 ^ // 2i r, 
- 7 t/ 2 ^ o ^ tt/ 2. The components of r are 

a- = a cos v cos y = o cos v sin m, z = a sin l>. 

The curves u = const and u = const are the “meridians” and “parallels” on S (see Fig. 241). This representation 
is used in geography for measuring the latitude and longitude of points on the globe. 

Another parametric representation of the sphere also used in mathematics is 

(3*) r («, v) — a cos u sin v i + a sin u sin v j 4- a cos v k 


where 0 ^ u ^ 2i r, 0 ^ v ^ tt. 
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EXAMPLE 3 


Parametric Representation of a Cone 

A circular cone z = V777 , 0 ^ t ^ H can be represented by 

r(w, y) = [fi cos v , u sin y , w] = u cos v i + u sin y j + «k, 

in components * = // cos y, y = u sin y, z = h. The parameters vary in the rectangle R: 0 ^ it ^ H, 0 ^ v ^ 27t. 
Check that a : 2 + y 2 = z 2 , as it should be. What are the curves it = const and v = const? ■ 


Tangent Plane and Surface Normal 

Recall from Sec. 9.7 that the tangent vectors of all the curves on a surface S through a 
point P of S form a plane, called the tangent plane of S at P (Fig. 242). Exceptions are 
points where S has an edge or a cusp (like a cone), so that S cannot have a tangent plane 
at such a point. Furthermore, a vector perpendicular to the tangent plane is called a normal 
vector of S at P. 

Now since 5 can be given by r = r(w, v) in (2), the new idea is that we get a curve C 
on S by taking a pair of differentiable functions 


u = u(t), V = v(t) 

whose derivatives u = du/dt and v' = dv/dt are continuous. Then C has the position 
vector ?(/) = r(w(/), By differentiation and the use of the chain rule (Sec. 9.6) we 
obtain a tangent vector of C on S 


. dr dr . dr , 
?'(/) = — = —«' + — v’. 
(it d u dv 


Hence the partied derivatives r u and r„ at P are tangential to S at P. We assume that they 
are linearly independent, which geometrically means that the curves u = const and 
v = const on S intersect at P at a nonzero angle. Then r ti and r„ span the tangent plane 
of S at P. Hence their cross product gives a normal vector N of S at P. 


(4) 


N = r u x r„ # 0. 


The corresponding unit normal vector n of S at P is (Fig. 242) 


(5) 


n = -At N = 1 r 

M \r u x rj 



Fig. 242. Tangent plane and normal vector 
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Also, if S is represented by g(*, y, z) = 0, then, by Theorem 2 in Sec. 9.7, 

m n - iiddii 8n,ds - 

A surface 5 is called a smooth surface if its surface normal depends continuously on 
the points of 5. 

S is called piecewise smooth if it consists of finitely many smooth portions. 

For instance, a sphere is smooth, and the surface of a cube is piecewise smooth 
(explain!). We can now summarize our discussion as follows. 


THEOREM 1 


Tangent Plane and Surface Normal 

If a surface S is given by (2) with continuous r u = dr/du and r v = drldv satisfying 
(4) at every point of S, then S has at every point P a unique tangent plane passing 
through P and spanned by r u and r y , and a unique normal whose direction depends 
continuously on the points of S. A normal vector is given by (4) and the 
corresponding unit normal vector by (5). (See Fig. 242.) 


EXAMPLE 4 Unit Normal Vector of a Sphere 

From (5*) we find that the sphere g(x, y , z) = x 2 + y 2 + z 2 - a 2 = 0 has the unit normal vector 


n(x, y, z) 



We see that n has the direction of the position vector [.v, y. 
must be the case? 


a a a 


2 ] of the corresponding point. Is it obvious that this 


EXAMPLE 5 Unit Normal Vector of a Cone 

At tiie apex of the cone g(x. y, z) = — z + V?77 = 0 in Example 3, the unit normal vector n becomes 
undetermined because from (5*) we get 


L V 2(.v 2 + y 2 ) V2(.v 2 + y 2 ) ^ J V2 \ Vx 2 + y 2 # Vx 2 + y 2 # ' 

We are now ready to discuss surface integrals and their applications, beginning in the next 
section. 


M aLe. - aj =£M-= SEE = E B E:5:' ~ 


1 1-10| PREPARATION FOR SURFACE INTEGRALS: 
PARAMETRIC REPRESENTATION, 
NORMAL 

Familiarize yourself with parametric representations of 
important surfaces by deriving a representation (1 ), by finding 
the parameter curves (curves u — const and v = const) of 
the surface and a normal vector N = r u x r v of the surface. 
(Show the details of your work.) 

1. Ary-plane r («, v) = [u, v] (thus u\ -I- uj; similarly in 
Probs. 2-10) 


2. Ary-plane in polar coordinates 

r(w, v) = [u cos Vy u sin v] (thus it = r, v = 0) 

3. Elliptic cylinder r(w, v) = [a cos u, b sin u, u] 

4. Paraboloid of revolution 

r(w, v) = |m cos Vy u sin v y it 2 ] 

5. Cone r (u 9 v) = [an cos u, an sin u. cu ] 

6. Hyperbolic paraboloid 

r(w, v) = [4 u cosh u, u sinh u, u 2 ] 

7. Elliptic paraboloid r(«, v) = [3 u cos u, 4 u sin o, it 2 ] 




SEC. 10.6 Surface Integrals 


449 


8. Helicoid r(w, v) = [u cos v, u sin v, v]. Explain the 

name. 

9. Ellipsoid 

r(«, v ) = [2 cos v cos u , 3 cos i> sin w, 4 sin u] 

10. Ellipsoid 

r (w, u) = [r/ cos u cos w, & cos o sin w, c sin y] 

11. CAS EXPERIMENT. Graphing Surfaces, 
Dependence on a, b y c. Graph the surfaces in Probs. 
l-l 0. In Probs. 6-9 generalize the surfaces by 
introducing parameters a , b, c and then find out in 
Probs. 3-10 how the shape of the surfaces depends on 
a , Z>. c. 

12-19 1 DERIVATION OF PARAMETRIC 
REPRESENTATIONS 

Find a parametric representation and a normal vector. (The 
answer gives one of them. There are many.) 

12. Plane 5 a + v — 3z = 30 

13. Plane 4 a* - 2y + lOz = 16 

14. Sphere (a - l) 2 + (y 4- 2) 2 + z 2 = 25 

15. Sphere (a + 2) 2 + y 2 + (z ~ 2) 2 = 1 

16. Elliptic paraboloid z = 4 a 2 + y 2 

17. Parabolic cylinder z = 3y 2 

18. Hyperbolic cylinder 9 a 2 - 4y 2 = 36 

19. Elliptic cone z = V9a 2 + y 2 


20. (Representation z = /(a, y)) Show that z = /(a, y) or 
g = z - /(a, y) = 0 can be written (/ u = a//aw, etc.) 

r (w, u) = [n, u, /(m, u)] and 

( 6 ) 

N = grad « = [-/„, 1]. 

21. (Orthogonal parameters) Show that the parameter 
curves u = const and v = const on a surface r (//, u) 
are orthogonal (intersect at right angles) if and only if 
r u T„ = 0. 

22. (Condition (4)) Find the points in Probs. 2-7 at which 
(4) N # 0 does not hold and state whether this is owing 
to the shape of the surface or to the choice of the 
representation. 

23. (Change of representation) Represent the paraboloid 
in Prob. 4 so that N(0, 0) =£ 0, and show N. 

24. PROJECT. Tangent Planes T(P) will be less 
important in our work, but you should know how to 
represent them. 

(a) If S : r (w, v ), then T(P): (r* - r r w r v ) = 0 
(a scalar triple product) or 

r*(p, q) = r(P) + pr u (P) + qr v (P). 

(b) If 5: g( a, y, z) = 0, then T(P): (r* - r (F))*Vg = 0. 

(c) If S: z = /(a, y), then 

T(P): z*-z = (a* - a )f x (P) T (y* - y)/ v (P)). 
Interpret (a) -(c) geometrically. Give two examples for 
(a), two for (b), and two for (c). 


10.6 Surface Integrals 

To define a surface integral, we take a surface S, given by a parametric representation as 
just discussed, 

(1) r («, v) = [x (a, v ), y(w, u), z(u , u)] = a(m, u)i + y(w, u)j + z(w, u)k 

where (w, u) varies over a region R in the wu-plane. We assume S to be piecewise smooth 
(Sec. 10.5), so that S has a normal vector 


(2) N = r u x !•„ and unit normal vector n = -j^j- N 

at every point (except perhaps for some edges or cusps, as for a cube or cone). For a given 
vector function F we can now define the surface integral over S by 

JjF*ndA = J J F(r (it, u))*N(w, v) du dv. 

s R 


( 3 ) 
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Here N = |N|n by (2), and |N| = |r u x r v \ is the area of the parallelogram with sides r u 
and r v , by the definition of cross product. Hence 

(3*) n dA = n |N| du dv = N du dv. 


And we see that dA = |N| du dv is the element of area of S . 

Also F*n is the normal component of F. This integral arises naturally in flow problems, 
where it gives the flux across S (— mass of fluid crossing S per unit time; see Sec. 9.8) 
when F = pv. Here, p is the density of the fluid and v the velocity vector of the flow 
(example below). We may thus call the surface integral (3) the flux integral. 

We can write (3) in components, using F = [F x , F 2 , F 3 ], N = [N l9 1V 2 , W 3 ], and 
n = [cos a, cos /?, cos y], Here, a, (3, y are the angles between n and the coordinate axes; 
indeed, for the angle between n and i, formula (4) in Sec. 9.2 gives cos a = n*i/|n||i| = n*i, 
and so on. We thus obtain from (3) 

JJf*u dA = JJ(F X cos a -f F 2 cos /3 + F 3 cos y) dA 
s s 

(4) rr 

= J J (FiNi + F 2 N 2 + F 3 N 3 ) du dv. 

R 

In (4) we can write cos a dA = dy dz , cos /3 dA = dz dx , cos y dA = dx dy. Then (4) 
becomes the following integral for the flux: 

(5) J J F»n dA = JJ (F t dy dz 4- F 2 dz dx *f F z dx dy). 

s s 

We can use this formula to evaluate surface integrals by converting them to double integrals 
over regions in the coordinate planes of the jcyz-coordinate system. But we must carefully 
take into account the orientation of S (the choice of n). We explain this for the integrals 
of the F 3 -terms, 

(5') JJf 3 cos y dA = j j F 3 dx dy. 

s s 

If the surface S is given by z — h(x y y) with (x, y) varying in a region R in the xy-plane, 
and if S is oriented so that cos y > 0, then (5 ; ) gives 

(5") f ff 3 cosy dA = + j j F 3 (x, y, h{x, y)) dx dy. 

S R 

But if cos y < 0, the integral on the right of (5 /; ) gets a minus sign in front. This follows 
if we note that the element of area dx dy in the jcy-plane is the projection (cos y | dA of 
the element of area dA of S; and we have cos y = + |cos y\ when cos y > 0, but 
cos y = — |cos y\ when cos y < 0. Similarly for the other two terms in (5). At the same 
time, this justifies the notations in (5). 

Other forms of surface integrals will be discussed later in this section. 
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EXAMPLE 1 


EXAMPLE 2 


Flux Through a Surface 

Compute the flux of water through the parabolic cylinder S: y = .v 2 0 ^ x ^ 2, 0 ^ z ^ 3 (Fig. 243) if the 
velocity vector is v = F = [3 z 2 6, 6*z], speed being measured in meters/sec. (Generally, F = pv, but water 
has the density p = 1 gm/cm 3 = 1 ton/m 3 .) 


z 



Fig. 243. Surface S in Example 1 


Solution . Writing = u and z = v, we have y = .v 2 = m 2 . Hence a representation of S is 

S: r = [«, m 2 , m] (0 = m. = 2, 0 = u = 3). 

By differentiation and by tiie definition of the cross product, 

N = x r tf = [1, 2«, 0] x [0. 0, I] = [2 m, -1, 0]. 

On £ writing simply F(S) for F[r(M, {/)], we have F (S) = [3m 2 , 6, 6 mi;]. Hence F(S)*N = 6 mi; 2 - 6. By 
integration we thus get from (3) the flux 


/•3 .2 


J J F»n dA = J j (6 uv 2 - 6)dudv = J (3m 2 m 2 - 


■'o-'o 

.3 


6 m) 


dv 


= j (I2 m 2 - 12 )dv = (4m 3 - 12m) 


= 108 - 36 = 72 [m 3 /sec] 


y-0 


or 72 000 liters/sec. Note that the y-component of F is positive (equal to 6), so that in Fig. 243 the flow goes 
from left to right. 

Let us confirm this result by (5). Since 

N = |N|n = |N|[cos a, cos j3, cos yj = [2m, — 1, 0] = [2*, -1, 0] 

we see that cos a > 0, cos p < 0, and cos y = 0. Hence the second term of (5) on the right gets a minus sign, 
and the last term is absent. This gives, in agreement with the previous result. 

[ [ 3z z dy dz — f [ 6 dzdx = f 4(3 z 2 ) dz - [ 6-3 dx = 4-3® - 6-3-2 = 72. ■ 
dj 0 *'0 d t\ d(\ J n 


'o-'o 


Surface Integral 

Evaluate (3) when F = [a* 2 , 0, 3y 2 ] and S is the portion of the plane x + y + z = 1 in the first octant 
(Fig. 244). 



Fig. 244. Portion of a plane in Example 2 
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THEOREM 1 


EXAMPLE 3 


Solution . Writing x = u and y = v f we have z = \ — x - y = \ - u - u. Hence we can represent the plane 
a- + y + z = I in the form r(«, v) = [m, v % 1 - « - (/]. We obtain the first-octant portion S of this plane by restricting 
x = u and y = v to the projection R of 5 in the jry-plane. R is the triangle bounded by the two coordinate axes and 
the straight line x + y = I , obtained from x + y + z = 1 by setting z = 0. Thus 0 ^ x • ^ I — y\ 0 ^ y ^ 1. 

By inspection or by differentiation, 


N = r u x iv = [1, 0, -1] x [0, I, -1] = [1, I. 1]. 

Hence F(S)«N = [« 2 , 0, 3u 2 ] •[!, I, 1] = » 2 + 3y 2 . By (3). 

I J F*n clA = I [ (« 2 + 3 v 2 )dudv = (« 2 + 3 u z )dudu 

V r J 0 J o 

= J [y (1 - v) 3 + 3t> 2 (l - t>)J dv - j . 


Orientation of Surfaces 

From (3) or (4) we see that the value of the integral depends on the choice of the unit 
normal vector n. (Instead of n we could choose — n.) We express this by saying that such 
an integral is an integral over an oriented surface 5, that is, over a surface S on which 
we have chosen one of the two possible unit normal vectors in a continuous fashion. (For 
a piecewise smooth surface, this needs some further discussion, which we give below.) 
If we change the orientation of 5, this means that we replace n with — n. Then each 
component of n in (4) is multiplied by — l , so that we have 


Change of Orientation in a Surface Integral 

The replacement ofnby — n ( hence of N by — N) corresponds to the multiplication 
of the integral in (3) or (4) by — 1 . 


How do we effect such a change of N in practice if S is given in the form (1)? The 
simplest way is to interchange u and v , because then r u becomes r v and conversely, so 
that N = r u x r v becomes r v x r u = -r u x r v = as wanted. Let us illustrate this. 


Change of Orientation in a Surface Integral 

In Example 1 we now represent S by r = [u, u 2 , n], 0 ^ v ^ 2, 0 u ^ 3. Then 


N = r u x f v = [0, 0, 1] x [1, 2v , 0] = [-2u, 1, 0J. 


For F = [3c 2 . 6, 6a*z] we now get F (5) = [3« 2 6, 6uv]. Hence F(5) # N = -6 u 2 v + 6 and integration gives 
the old result times - 1, 


f f F(S)'fUvc/u = J J (-6 u 2 v + 6 ) dv du = J (-12 ti 2 + 12 )du = -72. 


Orientation of Smooth Surfaces 

A smooth surface S (see Sec. 10.5) is called orientable if the positive normal direction, 
when given at an arbitrary point P Q of S, can be continued in a unique and continuous 
way to the entire surface. For smooth surfaces occurring in applications this is always 
true. 
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(b) Piecewise smooth surface 
Fig. 245. Orientation of a surface 


Orientation of Piecewise Smooth Surfaces 

Here the following idea will do it. For a smooth orientable surface S with boundary curve 
C we may associate with each of the two possible orientations of S an orientation of C, 
as shown in Fig. 245a. Then a piecewise smooth surface is called orientable if we can 
orient each smooth piece of S so that along each curve C* which is a common boundary 
of two pieces S ± and S 2 the positive direction of C* relative to S 1 is opposite to the direction 
of C* relative to S 2 * See Fig. 245b for two adjacent pieces; note the arrows along C*. 

Theory: Nonorientable Surfaces 

A sufficiently small piece of a smooth surface is always orientable . This may not hold for 
entire surfaces. A well-known example is the Mobius strip 5 , shown in Fig. 246. To make 
a model, take the rectangular paper in Fig. 246, make a half-twist, and join the short sides 
together so that A goes onto A, and B onto B. At P 0 take a normal vector pointing, say, 
to the left. Displace it along C to the right (in the lower part of the figure) around the strip 
until you return to P 0 and see that you get a normal vector pointing to the right, opposite 
to the given one. See also Prob. 21. 



5 AUGUST FERDINAND MOBIUS (1790-1868), German mathematician, student of Gauss, known for his 
work in surface theory, geometry, and complex analysis (see Sec. 17.2). 
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EXAMPLE 4 


EXAMPLE 5 


Surface Integrals Without Regard to Orientation 

Another type of surface integral is 

(6) J J G( r) dA = JJ G(r(u, u))|N («, u)| du dv. 

S R 

Here clA = |N| du dv = |r M x rj du dv is the element of area of the surface S represented 
by (1) and we disregard the orientation. 

We shall need later (in Sec. 10.9) the mean value theorem for surface integrals, which 
states that if R in (6) is simply connected (see Sec. 10.2) and G( r) is continuous in a 
domain containing R , then there is a point (w 0 » t>o) * n R such that 

(7) JjG(r)r/A = G(r(« 0 , v Q ))A (A = Area of S). 

s 

As for applications, if G(r) is the mass density of S, then (6) is the total mass of S. If 
G = 1, then (6) gives the area A(S) of S, 

(8) A(S) = JJdA = JJ |r„ x rj dudv. 

S R 


Examples 4 and 5 show how to apply (8) to a sphere and a torus. The final example. 
Example 6, explains how to calculate moments of inertia for a surface. 


Area of a Sphere 

For a sphere r(u, v ) = [a cos v cos «, a cos v sin «, a sin v], 0 ^ u = 2 tt. -77/2 ^ v ^ 7r/2, [see (3) 
in Sec. 10.5] we obtain by direct calculation (verify!) 

r n x r v = [a 2 cos 2 v cos « t a 2 cos 2 v sin m, a 2 cos v sin v]. 

Using cos 2 u + sin 2 u = 1 and then cos 2 v + sin 2 v = 1, we obtain 

| r u x rj = a 2 (cos 4 v cos 2 u + cos 4 v sin 2 it + cos 2 v sin 2 v) l/2 = a 2 |cos u|. 


With this, (8) gives the familiar formula (note that |cos u| = cos v when —tt/2 ^ v ^ irt2) 


/ ▼r/2 ,.277- 

J |cos o| du dv = 2'ira 2 

-rr/2 J 0 


pir/2 

J rr/O 


cos v dv = 4-t ra 2 . 


-tt/2 


Torus Surface (Doughnut Surface): Representation and Area 

A writs surface S is obtained by rotating a circle C about a straight line L in space so that C does not intersect 
or touch L but its plane always passes through L. If L is the z-axis and C has radius b and its center has distance 
a (> b) from L as in Fig. 247, then 5 can be represented by 

r(n, v) = (a + b cos v) cos u i + (a + b cos o) sin it j + b sin v k 


where 0 ^ u % 2 tt, 0 ^ v ^ 2 it. Thus 


r tt = -(« + /? cos v ) sin ui + (a + b cos u) cos u j 
r v = -h sin v cos ui - b sin v sin u j + b cos u k 
r u xr v = b(a + b cos u)(cos u cos v i + sin u cos v j + sin v k). 



SEC 10.6 Surface Integrals 


455 


EXAMPLE 6 


Hence |r tt x rj = b(a + b cos u), and (8) gives the total area of the torus. 


(9) 


,2tt Jfcir 


A(S) 


-It 


b{a + b cos v) du dv = 4tAi£. 



Fig. 247. Torus in Example 5 


Moment of Inertia of a Surface 

Find the moment of inertia / of a spherical lamina S: x 2 + y 2 + z 2 = a 2 of constant mass density and total 
mass M about the “-axis. 

Solution. If a mass is distributed over a surface S and fx(x. y, z ) is the density of the mass (= mass per unit 
area), then the moment of inertia / of the mass with respect to a given axis L is defined by the surface integral 


( 10 ) 


-// 


fiD 2 dA 


where D(x, y, z ) is the distance of the point (x,.y, z) from L. Since, in the present example, fx is constant and S 
has the area A — 4 ira 2 , we have fi = M/A = M(4ira 2 ). 

For S we use the same representation as in Example 4. Then D 2 = x 2 + y 2 - a 2 cos 2 v. Also, as in tliat example, 
dA = a 2 cos v du dv. This gives the following result, [in the integration, use cos 3 v = cos v (1 — sin 2 u).] 



M 

4 ira 2 



cos 3 v du dv = 


Ma 2 r ,Z 

2 J-w 2 


cos 3 v dv = 


IMa 2 
3 ’ 


Representations z = f(x 9 y). If a surface 5 is given by z = f(x, y), then setting « = jc, 
v = )\r = [w, u, /] gives 

|N| = |r w x r„| = |[1, 0, f u ] x [0, 1, /J| = |[-/ u , 1]| = Vl + /„ 2 + J} 

and, since f u = f x , /„ = f y , formula (6) becomes 


f f G(r) dA = / J G(x, y, f(x, y)) 

* R* 



dxdy. 


( 11 ) 
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Here R* is the projection of S into the jry-plane (Fig. 248) and the normal vector N on S 
points up. If it points down, the integral on the right is preceded by a minus sign. 

From (11) with G = 1 we obtain for the area A(S) of 5: z = fix , y) the formula 

a» ms) - fj 7> + (If + (f J * * 

where R* is the projection of S into the xv-plane, as before. 




|l-12| FLUX INTEGRALS (3) J F*n dA 

Evaluate these integrals for the following data. Indicate the 
kind of surface. (Show the details of your work.) 

1. F = [2a\ 5y, 0], S: r = [m, v. An 4- 3u], 

0 = u= 1, — 8 = u = 8 

2. F = [a* 2 , y 2 , z 2 ], 

5:A' + y + s = 4 t ^0,^0,i^0 

3. F = [a* - z, y - x ; z - y]. 

S : r = [m cos u, u sin u, w], 0 ^ u ^ 3, 0 ^ v ^ 7r 

4. F = [*» -* 2 T «*], 

5: a* 2 4 y 2 — 9, a* = 0, y = 0, 0 = z = 2 

5. F = [a, y, z], 5: r = [zz cos u, w sin v , zz 2 ], 

0 ^ ZZ ^ 4, — 7T ^ u ^ 7T 

6. F = [cosh yz. 0, y 4 ], 

5: y 2 + z 2 = 1, 0 = a = 20. z i= 0 

7. F = [1, 1, 1], S the sphere of radius 1 and center 0 

8. F = [tan Ay, A 2 y, — z], S: y 2 4 §z 2 = 1, 1 ^ a ^ 4 

9. F = [0, a*. 0], 

5: a 2 4 y 2 4 z 2 = a 2 , x ^ 0, y ^ 0, z ^ 0 

10. F = [y 2 , a 2 , z 4 ], 

S: z = 4VA- 2 4 y 2 , 0 ^ z ^ 8, y ^ 0 

11. F = [y 3 , a 3 , z 3 ], 

S: a 2 4 4y 2 = 4, a ^ 0, y ^ 0, 0 ^ z ^ h 

12. F = [coshy, 0, sinhA*], 

S: z = a 4 y 2 , 0 = y = a*, 0 = a =i 1 


13. CAS EXPERIMENT. Write a program for evaluating 
surface integrals (3) that prints intermediate results 
(F, F*N, the integral over one of the two variables). 
Can you experimentally obtain rules on functions and 
surfaces giving integrals that can be evaluated by the 
usual methods of calculus? Make a list of positive and 
negative results. 


14-20 


SURFACE INTEGRALS 


(6) JjG(r)dA 


Evaluate these integrals for the following data. Indicate the 
kind of surface. (Show the details.) 


14. G = cos y 4 sin a. 

S.‘ a 4 y 4 z = 2, a = 0, y = 0, z = 0 

15. G = 5(a 4 y 4 z), 

S: z = a 4 2y, 0 = y = a, 0 = a = 2 

16. G = ye x 4 xe y 4 e 2 , 

S: a 2 4 y 2 = 16,y ^ 0, 0 ^ z g 4 

17. G = (x 2 + y 2 + z 2 ) 2 , S: z = V.v 2 + y 2 , y g 0, 

0 S z g 2 

18. G = ax 4 by 4 cz, S: a 2 4 y 2 4 z 2 = 1 , y = 0, z = 0 

19. G = arctan (y/jt), 

5: z = A 2 4 y 2 , l = z = 9, a = 0, y = 0 

20. G = 3Ay T S: z = Ay, O^A^l.O^y^l 


21. (Fun with Moblus) Make Mobius strips from long 
slim rectangles R of grid paper (graph paper) by pasting 
the short sides together after giving the paper a half- 
twist. In each case count the number of parts obtained 
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by cutting along lines parallel to the edge, (a) Make R 
three squares wide and cut until you reach the 
beginning, (b) Make R foursquares wide. Begin cutting 
one square away from the edge until you reach the 
beginning. Then cut the portion that is still two squares 
wide, (c) Make R five squares wide and cut similarly, 
(d) Make R six squares wide and cut. Formulate a 
conjecture about the number of parts obtained. 

APPLICATIONS 

22. (Center of gravity) Justify the following formulas for 
the mass M and the center of gravity ( x , y, z) of a lamina 
S of density (mass per unit area) cr(x, y, z) in space: 



s s 


23. (Moments of inertia) Justify the following formulas 
for the moments of inertia of the lamina in Prob. 22 
about the x y- f and z- axes, respectively: 

f x = / f ( y 2 + z 2 )crdA t J y = ff C * 2 + z z )crdA , 
s s 

f z = ff (X 2 + y 2 )(TdA. 

s 

24. Find a formula for the moment of inertia of the lamina 
in Prob. 22 about the line y = x, z = 0. 

Find the moment of inertia of a lamina S of density 1 
about an axis A, where 

25. S: x 2 + y 2 = 1 , 0 ^ z ^ /?, A: the z-axis 

26. S as in Prob. 25, A: the line z = h/2 in the A*z-plane 

27. S: x 2 -I- y 2 = z 2 , 0 ^ z = h , A: the z-axis 

28. (Steiner’s theorem 6 ) If / A is the moment of inertia of 
a mass distribution of total mass M with respect to an 
axis A through the center of gravity, show that its 
moment of inertia I B with respect to an axis B , which 
is parallel to A and has the distance k from it, is 


(13) ds 2 = E dtt 2 + 2 F du du + G dv 2 
with coefficients 

(14) F=r tt *r w , F = r u *r v , G = r v *r v 

is called the first fundamental form of S. (£, F, G are 
standard notations that have nothing to do with F and 
G that occur at some other places in this chapter.) The 
first fundamental form is basic in the theory of surfaces, 
since with its help we can determine lengths, angles, 
and areas on S. To show this, prove the following. 

(a) For a curve C: u = «(f), v ~ v(t ), a ^ t ^ b y on 
5, formulas (10), Sec. 9.5, and (14) give the length 

r *> 

/= I Vr '(/)*r’(/)rf( 

} ' 

= I V£m ' 2 + iFTv' + Go' 2 dt. 

J a 

(b) The angle y between two intersecting curves 
C,: u = g(t ), v = /i(0 and C 2 : u = p(t ), u = q(t) on 
S: r («, v) is obtained from 

(16) cosy= m\ 

where a = r n g' + r v !i and b = r u p' + r v q are 
tangent vectors of C x and C 2 . 

(c) The square of the length of the normal vector N 
can be written 

(17) |N| 2 = |r„ x r„| 2 = EG - F 2 , 

so that formula (8) for the area A(S) of 5 becomes 


A(S) = jfdA = // M dudv 
( 18 ) S R 

= J J VEG - F 2 du dv. 

R 

(d) For polar coordinates u (= /*) and v (= 6) defined 
by a = w cos u, y = u sin v we have E = 1, F = 0, 
G = w 2 , so that 


l B = I a + k*M. 


ds 2 = du 2 + u 2 dv 2 = dr 2 + r 2 d0 2 . 


29. Using Steiner’s theorem, find the moment of inertia of 
S in Prob. 26 about the x-axis. 

30. TEAM PROJECT. First Fundamental Form of a 
Surface. Given a surface S: r(w, u), the corresponding 
quadratic differential form 


Calculate from this and (18) the area of a disk of 
radius a. 

(e) Find the first fundamental form of the torus in 
Example 5. Use it to calculate the area A of the torus. 
Show that A can also be obtained by the theorem of 


6 JACOB STEINER (1796-1863), Swiss geometer, born in a small village, learned to write only at age 14, 
became a pupil of Pestalozzi at 18. later studied at Heidelberg and Berlin and, finally, because of his outstanding 
research, was appointed professor at Berlin University. 
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Pappus , 7 which states that the area of a surface of 
revolution equals the product of the length of a 
meridian C and the length of the path of the center of 
gravity of C when C is rotated through the angle 2 tt. 


(£) Calculate the first fundamental form for the usual 
representations of important surfaces of your own 
choice (cylinder, cone, etc.) and apply them to the 
calculation of lengths and areas on these surfaces. 


10 .’ Triple Integrals. 

Divergence Theorem of Gauss 

In this section we discuss another “big” integral theorem, the divergence theorem, which 
transforms surface integrals into triple integrals. So let us begin with a review of the latter. 

A triple integral is an integral of a function f(x 9 y, z) taken over a closed bounded 
(three-dimensional) region T in space (where “closed” and “bounded” are defined as in 
footnote 2 of Sec. 10.3, with “sphere” substituted for “circle”). We subdivide T by planes 
parallel to the coordinate planes. Then we consider those boxes of the subdivision 
(rectangular parallelepipeds) that lie entirely inside T , and number them from 1 to n. In 
each such box we choose an arbitrary point, say, (x k , y fc , z k ) in box k. The volume of box 
k we denote by AV k . We now form the sum 

n 

Jn = 2 /(•**, 3’fc. Zk) AVfc. 

/c=l 

This we do for larger and larger positive integers n arbitrarily but so that the maximum 
length of all the edges of those n boxes approaches zero as n approaches infinity. This 
gives a sequence of real numbers J n ^ J ^ • • • . We assume that f(x, y , z) is continuous in 
a domain containing T, and T is bounded by finitely many smooth surfaces (see Sec. 10.5). 
Then it can be shown (see Ref. [GR4] in App. 1 ) that the sequence converges to a limit 
that is independent of the choice of subdivisions and corresponding points (x fc , y fe , Zk )• This 
limit is called the triple integral of f(x , y, z) over the region T and is denoted by 

1 1 - v ’ dxdydz or by jjj f(x, y, z) dV. 

T T 

Triple integrals can be evaluated by three successive integrations. This is similar to the 
evaluation of double integrals by two successive integrations, as discussed in Sec. 10.3. 
An example is shown below (Example 1). 

Divergence Theorem of Gauss 

Triple integrals can be transformed into surface integrals over the boundary surface of a 
region in space and conversely. Such a transformation is of practical interest because one 
of the two kinds of integral is often simpler than the other. It also helps in establishing 
fundamental equations in fluid flow, heat conduction, etc., as we shall see. The 
transformation is done by the divergence theorem, which involves the divergence of a 
vector function F = [F l9 F 2 , F 3 ] = F x i + F 2 j + F 3 k, namely. 


7 PAPPUS OF ALEXANDRIA (about a.d. 300), Greek mathematician. The theorem is also called Guldin’s 
theorem. HABAKUK GULD1N (1577-1643) was bom in St. Gallen, Switzerland, and later became professor 
in Graz and Vienna. 
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THEOREM 1 


EXAMPLE 1 



l 

i 

i 

i 





Fig. 249. Surface 5 
in Example 1 


(1) 


div F = 


dF\ 

dx 


+ 


dF 2 i dF z 

dy dz 


(Sec. 9.8). 


Divergence Theorem of Gauss 

(Transformation Between Triple and Surface Integrals) 

Let T be a closed bounded region in space whose boundary is a piecewise smooth 
orientable surface S. Let F(jt, y , z) be a vector function that is continuous and has 
continuous first partial derivatives in some domain containing T. Then 


( 2 ) 


JJ Jdiv FdV = J J F»n dA. 

T S 


In components of F = [F 1? F 2 , F 3 ] and of the outer unit normal vector 
n = [cos a, cos /3, cos y] ofS (as in Fig. 250), formula (2) becomes 


(2*) 



+ 



dxdydz 


= JJ(Fi cos a + F 2 cos /3 -f F 3 cos y) dA 
s 

= j J(F ± dy dz + F 2 dzdx + F 3 dx dy). 
s 


The proof follows after Example 1. “Closed bounded region” is explained above, 
“piecewise smooth orientable” in Sec. 10.5, and “domain containing 7” in footnote 4, 
Sec. 10.4, for the two-dimensional case. 


Evaluation of a Surface Integral by the Divergence Theorem 

Before we prove the theorem, let us show a typical application. Evaluate 


/ 



+ .v 2 y dz dx + x 2 z dx dy) 


where S is the closed surface in Fig. 249 consisting of the cylinder x 2 4- y 2 = a 2 (0 z = b) and the circular 
disks z = 0 and z = b (x 2 + y 2 ^ a 2 ). 


Solution. Fi = .v 3 . F 2 = x z y. F 3 = x\ Hence div F = 3x 2 + .t 2 + .v 2 = 5.v 2 . The form of the surface 
suggests that we introduce polar coordinates r, 6 defined by x = r cos 0, y = r sin 6 (thus cylindrical coordinates 
r, fl, z). Then the volume element is dx dy dz = r dr d6 dz , and we obtain 


2 cos 2 0)r drdQdz 


/= [ff 5x 2 dxdydz = f f / (5 r ! 

J if J z = 0 J ff-0 J rmO 

b 2ir 4 b 4 ^ 

f f a 9 f fl 7T 57 T ^ 

= 5 1 I — cos 2 8d8dz = 5 I — — dz = —— a%. 

•'r-o-Vo 4 J *-0 4 4 
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PROOF We prove the divergence theorem, beginning with the first equation in (2*). This equation 


is true if and only if the 

integrals of each component on both sides are equal; that is, 

(3) 

/// ^ dy dz = J JV a cos a dA , 

J J T J <$x V 

(4) 

HI d JC dy dz = J J F 2 cos ft dA , 

J T J dy s 

(5) 

/// ' dx dy dz = J Jf 3 cos ydA. 

T s 


We first prove (5) for a special region T that is bounded by a piecewise smooth 
orientable surface S and has the property that any straight line parallel to any one of the 
coordinate axes and intersecting T has at most one segment (or a single point) in common 
with T. This implies that T can be represented in the form 


( 6 ) 


y) = z = Kx, y) 


where (r, y ) varies in the orthogonal projection R of T in the xy-plane. Clearly, 
z = g(x , y) represents the “bottom” S 2 of S (Fig. 250), whereas z = h{x, y) represents the 
“top” of S, and there may be a remaining vertical portion S 3 of S. (The portion S 3 may 
degenerate into a curve, as for a sphere.) 

To prove (5), we use (6). Since F is continuously differentiable in some domain 
containing T, we have 


(7) 


ffJf.A 4 A.ff [/“’■'’iSu 

J T [Jgtx, y) dz 


dz dxdy. 


Integration of the inner integral [• • •] gives F 3 [x, y, h(x, y)] — F 3 [jc, y, g(x, y)]. Hence the 
triple integral in (7) equals 


( 8 ) 


J / F 9 [x, y> Kx, y)] dx dy - fj F 3 [x, y, g(x, y)] dx dy. 

R R 




Fig. 250. Example of a special region 
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EXAMPLE 2 


But the same result is also obtained by evaluating the right side of (5); that is [see also 
the last line of (2*)], 

J J F 3 cos ydA = JJV 3 dxdy 
s s 

= + J J F 3 [a-, y, h(x, >0] dx dy - JJ F 3 [x, y, g(x, v)] dx dy, 

R R 

where the first integral over R gets a plus sign because cos y > 0 on S 1 in Fig. 250 [as 
in (5"), Sec. 10.6], and the second integral gets a minus sign because cos y < 0 on S 2 . 
This proves (5). 

The relations (3) and (4) now follow by merely relabeling the variables and using the 
fact that, by assumption, T has representations similar to (6), namely, 

g(y, z) = a = My, z ) and f (z, a) ^ y ^ /7(z, a). 

This proves the first equation in (2*) for special regions. It implies (2) because the left side 
of (2*) is just the definition of the divergence, and the right sides of (2) and of the first 
equation in (2*) are equal, as was shown in the First line of (4) in the last section. Finally, 
equality of the right sides of (2) and (2*), last line, is seen from (5) in the last section. 
This establishes the divergence theorem for special regions. 

For any region T that can be subdivided into finitely many special regions by means of 
auxiliary surfaces, the theorem follows by adding the result for each part separately; this 
procedure is analogous to that in the proof of Green’s theorem in Sec. 10.4. The surface 
integrals over the auxiliary surfaces cancel in pairs, and the sum of the remaining surface 
integrals is the surface integral over the whole boundary surface S of T; the triple integrals 
over the parts of T add up to the triple integral over T. 

The divergence theorem is now proved for any bounded region that is of interest in 
practical problems. The extension to a most general region T of the type indicated in the 
theorem would require a certain limit process; this is similar to the situation in the case 
of Green’s theorem in Sec. 10.4. ■ 

Verification of the Divergence Theorem 

Evaluate J J (7jci - zk)»n dA over the sphere S: x 2 + y 2 + z 2 = 4 (a) by (2), (b) directly, 
s 

Solution, (a) div F = div [7 a-, 0, -z] = div [7-vi - zk] = 7 - I = 6. Answer: 6 • (4/3)tt- 2 3 = 64ir. 

(b) We can represent 5 by (3), Sec. 10.5 (with a — 2), and we shall use n dA - N du dv Lsee (3*), Sec. 10.6]. 
Accordingly, 

S: r = [2 cos v cos //, 2 cos v sin u r 2 sin v]. 

Then r w = [-2 cos v sin //, 2 cos v cos «, 0] 

r v — [—2 sin v cos //, — 2 sin v sin u. 2 cos v] 

N - r u x r„ = [4 cos 2 v cos m, 4 cos 2 v sin u, 4 cos v sin u]. 

Now on S we have x = 2 cos v cos m. z = 2 sin u, so that F = [7.v, 0, — z] becomes on 5 

F (5) = 1 14 cos v cos m, 0. -2 sin y] 

F(5)»N = (14 cosy cos«)*4cos 2 y cosh + (-2 sin y)* 4 cosy sin v 
= 56 cos 3 y cos 2 // — 8 cos u sin 2 y. 


and 
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On S we have to integrate over u from 0 to 27T. This gives 

7 t • 56 cos 3 v — 27r • 8 cos v sin 2 v. 

The integral of cos v sin 2 v equals (sin 3 u)/3, and that of cos 3 v = cos v (1 - sin 2 v) equals sin v - (sin 3 v)t3. 
On S we have -tt/2 ^ o ^ 7r/2, so that by substituting these limits we get 

56*7(2 - 2/3) - 16tt-2/3 = 64 tt 

as hoped for. To see the point of Gauss's theorem, compare the amounts of work. II 

Coordinate Invariance of the Divergence. The divergence (1) is defined in terms of 
coordinates, but we can use the divergence theorem to show that div F has a meaning 
independent of coordinates. 

For this purpose we first note that triple intgrals have properties quite similar to those 
of double integrals in Sec. 10.3. In particular, the mean value theorem for triple integrals 
asserts that for any continuous function /(jc, y, z) in a bounded and simply connected 
region T there is a point Q: (a* 0 , yo, -o) * n T such that 

(9) fff f(x , y, z ) dV = fix o, y 0 , z 0 WiT) (V(7) = volume of T). 

T 

In this formula we interchange the two sides, divide by V(T ), and set f = div F. Then by 
the divergence theorem we obtain for the divergence an integral over the boundary surface 
5(7) of 7, 

(10) div F(*o, *» Zo) = Iff div F dV = //f-i. dA. 

We now choose a point P: (a‘ : , y x , z{) in T and let T shrink down onto P so that the 
maximum distance d(T) of the points of T from P goes to zero. Then Q: (a 0 , y 0 , zq) must 
approach P. Hence (10) becomes 


(ID 


div F(P) = lim —— 
d(r)—o V{T) 


ff F 

S(T) 


n dA. 


This proves 


THEOREM 2 


Invariance of the Divergence 

The divergence of a vector function F with continuous first partial derivatives in a 
region T is independent of the particular choice of Cartesian coordinates . For any 
P in T it is given by (11). 


Equation (1 1) is sometimes used as a definition of the divergence. Then the representation 
(1) in Cartesian coordinates can be derived from (11). 

Further applications of the divergence theorem follow in the problem set and in the 
next section. The examples in the next section will also shed further light on the nature 
of the divergence. 
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P R QMT W SET 10.7 


[us] APPLICATION OF TRIPLE INTEGRALS: 
MASS DISTRIBUTION 

Find the total mass of a mass distribution of density cr in a 
region T in space. (Show the details of your work.) 

1. cr = a 2 v 2 z 2 . T the box |a| ^ a , |y| ^ b y |z| ^ c 

2. o* = a 2 4- y 2 + z 2 , T the box 0 ^ .v ^ 4, 0 ^ y ^ 9. 
0 ^ z = 1 

3. o' = sin.v cosy. 7V 0 ^ a 5 ^7T. ^7T — .v = v ^ |tt, 

0 ^ z = 12 

4. a - T the tetrahedron with vertices (0. 0, 0), 

(2. 0. 0), (0, 2. 0), (0. 0, 2) 

5. a = §(a* + y 2 ) 2 , T the cylinder a* 2 + y 2 = 4, \z\ ~ 2 

6. cr = 30z, T the region in the First octant bounded by 
y = 1 — a* 2 and z = a*. Sketch it. 

7. cr = I 4- v + z 2 , T the cylinder v 2 + z 2 ^ 9, 

1 = A" = 9 

8. cr = a 2 + y 2 , 7 the ball a 2 + y 2 4* z 2 ^ « 2 


9-14 


APPLICATION OF TRIPLE INTEGRALS: 
MOMENT OF INERTIA 


I x = JJ j (v 2 4- z 2 ) dv dz of a mass of density 1 in 

T 

a region T about the A-axis. Find l x when T is as follows. 


9. The cube 0 ^ x ^ a, 0 £ y a, 0 £ z a 

10. The box O^x^a, -b!2 ^ y £ b/2, -c/2 z ^ c/2 

11. The cylinder y 2 4 - z 2 = a 2 , 0 x 2s /t 

12. The ball a- 2 + y 2 4 z 2 ^ « 2 

13. The cone y 2 + z 2 ^ a 2 . 0 ^ a ^ h 


14. The paraboloid y 2 + z 2 = a, 0 ^ a ^ /? 

1 f h 

15. Show that for a solid of revolution, I x = — w I r 4 (x) dx. 

Use this to solve Probs. 1 1-14. 0 

16. Why is I x in Prob. 13 for large h larger than l x in Prob. 
14? Why is it smaller for h = 1? Give physical reason. 


17-25 


APPLICATION OF THE DIVERGENCE 
THEOREM: 

SURFACE INTEGRALS I I F*n dA 


Evaluate this integral by the divergence theorem. (Show the 

details.) 

17. F = [a, y, z], S the sphere a 2 -I- y 2 + z 2 = 9 

18. F = [4a, 3z, 5y], S the surface of the cone 
a 2 + y 2 ^ z 2 , 0 k z ^ 2 

19. F = [z - y y 3 , 2z 3 ], S the surface of v 2 + z 2 = 4, 
— 3 = a = 3 ~ 

20. F = [3av 2 . va 2 - y 3 , 3za 2 ], S the surface of 
a 2 + y 2 5 25, 0 ^ z = 2 

21. F = [sinv. cos a, cosz], S the surface of 
a 2 4 y 2 ^ 4, |z| = 2 

22. F = [a 3 — y 3 , y 3 - z 3 . z 3 - a 3 ], S the surface of 
a 2 4- y 2 + z 2 ^ 25, z ^ 0 

23. F = [4a 2 , 2a 4- y 2 , a 2 4 z 2 ]. S the surface of the 
tetrahedron in Prob. 4 

24. F = [4a 2 , y 2 , -2 cos ttz], S the surface of the 

tetrahedron with vertices (0, 0, 0), (1, 0, 0), (0, 1, 0), 
(0, 0, 1) 

25. F = [5a 3 , 5y 3 , 5z 3 ], S: a 2 + y 2 + z 2 = 4 


10.8 Further Applications of the 
Divergence Theorem 

We show in this section that the divergence theorem has basic applications in fluid flow, 
where it helps characterize sources and sinks of fluid, in heat flow , where it leads to the 
basic heat equation , and in potential theory , where it gives basic properties of the solutions 
of Laplace 's equation. Here the region T and its boundary surface S are assumed to be 
such that the divergence theorem applies. 

EXAMPLE 1 Fluid Flow. Physical Interpretation of the Divergence 

From the divergence theorem we may obtain an intuitive interpretation of the divergence of a vector. For this 
purpose wc consider the flow of an incompressible fluid (see Sec. 9.8) of constant density p = I which is steady, 
that is, does not vary with time. Such a flow is determined by the field of its velocity vector v(P) at any 
point P. 
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EXAMPLE 2 


Let 5 be the boundary surface of a region T in space, and let n be the outer unit normal vector of S. Then 
v n is the normal component of v in the direction of n, and |v • n dA\ is the mass of fluid leaving T (if v • n > 0 
at some P) or entering T (if v n < 0 at P) per unit time at some point P of S through a small portion AS of S 
of area A A. Hence the total mass of fluid that flows across S from T to the outside per unit time is given by the 
surface integral 

J j vndA. 
s 

Division by the volume V of T gives the average flow out of T: 

(i) "V If 

Since the flow is steady and the fluid is incompressible, the amount of fluid flowing outward must be continuously 
supplied. Hence, if the value of the integral (1) is different from zero, there must be sources ( positive sources 
and negative sources, called sinks) in T, that is, points where fluid is produced or disappears. 

If we let T shrink down to a fixed point P in T % we obtain from (1) the source intensity at P given by the 
right side of (1 1) in the last section with F*n replaced by v*n, that is. 


( 2 ) 


div v(P) = .Um_ — Jf 


dm— o V(T) 


vn dA. 


sen 


Hence the divergence of the velocity vector v of a steady incompressible flow is the source intensity of the flow 
at the corresponding point. 

There are no sources in T if and only if div v is zero everywhere in T. Then for any closed surface S in T we 
have 


II 


vndA = 0. 


Modeling of Heat Flow. Heat or Diffusion Equation 

Physical experiments show that in a body, heat flows in the direction of decreasing temperature, and the rate of 
flow is proportional to the gradient of the temperature. This means that the velocity v of the heat flow in a body 
is of the form 


(3) 


v = -K grad U 


where U(x, y, z, t) is temperature. / is time, and K is called the thermal conductivity of the body; in ordinary 
physical circumstances AT is a constant. Using this information, set up the mathematical model of heat flow, the 
so-called heat equation or diffusion equation. 

Solution . Let T be a region in the body bounded by a surface S with outer unit normal vector n such that 
the divergence theorem applies. Then v»n is the component of v in the direction of n, and the amount of heat 
leaving T per unit time is 


ii 


v»ndA. 

This expression is obtained similarly to the corresponding surface integral in the last example. Using 

div (grad U) = V 2 C/ =U XX + U yy + U a 
(the Laplacian; see (3) in Sec. 9.8), we have by the divergence theorem and (3) 

J JvndA = —K J J J div (grad U) dx dy dz 


< 4 ) 


-K j j Jv 2 lf dxdy dz. 
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EXAMPLE 3 


On the other hand, the total amount of heat H in T is 

f crpU dxdydz 


»-///• 


where the constant cr is the specific heat of the material of the body and p is the density (= mass per unit 
volume) of the material. Hence the time rate of decrease of H is 

dH Cff dU 

- ir = -jJJ ap ir dxdydz 

T 

and this must be equal to the above amount of heat leaving T. From (4) we thus have 

— JJ J ap dxdydz = —K JJ Jv 2 U dxdydz 


JJ J [o-p -JJ - KV 2 uj dxdydz = 0. 


Since this holds for any region T in the body, the integrand (if continuous) must be zero everywhere; that is, 


(5) 


— — = c 2 V 2 C/ 

dt 


c 2 = 


K_ 

ap 


where c 2 is called the thermal diffusivity of the material. This partial differential equation is called the heat 
equation. It is the fundamental equation for heat conduction. And our derivation is another impressive 
demonstration of the great importance of the divergence theorem. Methods for solving heat problems will be 
shown in Chap. 1 2. 

The heat equation is also called the diffusion equation because it also models diffusion processes of motions 
of molecules tending to level off differences in density or pressure in gases or liquids. 

If heat flow does not depend on time, it is called steady-state heat flow. Then dU/dt = 0, so that (5) reduces 
to Laplace’s equation V 2 U = 0. We met this equation in Secs. 9.7 and 9.8, and we shall now see that the 
divergence theorem adds basic insights into the nature of solutions of this equation. H 


Potential Theory. Harmonic Functions 

The theory of solutions of Laplace’s equation 


( 6 ) 


V 2 / = + 

} dx 2 


tL + tL 

dy 2 dz 2 


= 0 


is called potential theory. A solution of (6) with continuous second-order partial 
derivatives is called a harmonic function. That continuity is needed for application of 
the divergence theorem in potential theory, where the theorem plays a key role that we 
want to explore. Further details of potential theory follow in Chaps. 12 and 18. 


A Basic Property of Solutions of Laplace’s Equation 

The integrands in the divergence theorem are div F and F»n (Sec. 10.7). If F is the gradient of a scalar function, 
say, F = grad/, then divF = div (grad/) = V 2 /; see (3), Sec. 9.8. Also, F-n = n*F = n*grad/. This is 
the directional derivative of / in the outer normal direction of 5, the boundary surface of the region T in the 
theorem. This derivative is called the (outer) normal derivative of / and is denoted by df/dn. Thus the formula 
in the divergence theorem becomes 
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THEOREM 1 


EXAMPLE 4 


(7) 

T S 

This is the three-dimensional analog of (9) in Sec. 10.4. Because of the assumptions in the divergence theorem 
this gives the following result. H 


A Basic Property of Harmonic Functions 

Let f(x , 3\ z) be a harmonic function in some domain D is space. Lei S be any 
piecewise smooth closed orientable suiface in D whose entire region it encloses 
belongs to D. Then the integral of the normal derivative of f taken over S is zero. 
(For “piecewise smooth” see Sec. 10.5.) 


Green’s Theorems 

Let / and g be scalar functions such that F = / grad g satisfies the assumptions of the divergence theorem in 
some region T. Then 


div F = div (/ grad g ) 


— (['S-'S-'SI 

£♦'$)*(* *♦'$) 


= /V 2 g + grad / • grad g. 


Also, since / is a scalar function, 


= n*(/ grad g) 
= (n-grad g)f. 


Now n»grad g is the directional derivative dg/dn of g in the outer normal direction of 5. Hence the formula in 
the divergence theorem becomes “Green’s first formula” 


( 8 ) 


/ II^ 2g + grad/'grad #) dV = JJ f^ dA - 


Formula (8) together with the assumptions is known as the first form of Green's theorem . 
Interchanging / and g we obtain a similar formula. Subtracting this formula from (8) we find 




This formula is called Green’s second formula or (together with the assumptions) the second form of Green's 
theorem. U 
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EXAMPLE 5 


THEOREM 2 


THEOREM 3 


THEOREM 3* 


Uniqueness of Solutions of Laplace’s Equation 

Let / be harmonic in a domain D and let / be zero ever}' where on a piecewise smooth closed orientable surface 
S in D whose entire region T it encloses belongs to D. Then V 2 g is zero in T. and the surface integral in (8) is 
zero, so that (8) with g = / gives 


J J^grad / •grad f elV = jfj |grad /| 2 dV = 0. 

T T 

Since / is harmonic, grad / and thus |grad /| are continuous in T and on S, and since |grad /| is nonnegative, 
to make the integral over T zero, grad f must be the zero vector everywhere in T. Hence f x ~fy-fz~ 0» 
and f is constant in T and, because of continuity, it is equal to its value 0 on S. This proves the following 
theorem. 


Harmonic Functions 

Let /(a*, y, z) be harmonic in some domain D and zero at every point of a piecewise 
smooth closed orientable surface S in D whose entire region T it encloses belongs 
to D. Then f is identically zero in T. 


This theorem has an important consequence. Let f 1 and f 2 be functions that satisfy the assumptions of Theorem 
1 and take on the same values on S. Then their difference fi — / 2 satisfies those assumptions and has the value 
0 everywhere on S. Hence, Theorem 2 implies that 

fi - f 2 = 0 throughout 7, 

and we have the following fundamental result. 


Uniqueness Theorem for Laplace’s Equation 

Let T be a region that satisfies the assumptions of the divergence theorem , and let 
f{x, y, z ) be a harmonic function in a domain D that contains T and its boundary 
surface S. Then f is uniquely determined in T by its values on 5. 


The problem of determining a solution u of a partial differential equation in a region T such that tt assumes 
given values on the boundary surface S of T is called the Diriehlet problem. 8 We may thus reformulate Theorem 
3 as follows. 


Uniqueness Theorem for the Diriehlet Problem 

If the assumptions in Theorem 3 are satisfied and the Diriehlet problem, for the 
Laplace equation has a solution in T, then this solution is unique . 


These theorems demonstrate the extreme importance of the divergence theorem in potential theory. 9 


8 PETER GUSTAV LEJEUNE DIRICHLET (1805-1859), German mathematician, studied in Paris under 
Cauchy and others and succeeded Gauss at Gottingen in 1855. He became known by his important research on 
Fourier series (he knew Fourier personally) and in number theory. 
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— — — If 9 mm* 


1. (Harmonic functions) Verify Theorem 1 for 

f = 2 a - 2 4- 2y 2 - 4z 2 and S the surface of the cube 
OgjtSl.Ogygl.OSzgl. 

2. (Harmonic functions) Verify Theorem 1 for 
/ = y 2 - a 2 and the surface of the cylinder 
a 2 + y 2 s 1, 0 s z = 5. 

3. (Green’s first formula) Verify (8) for f = 3y 2 , 
g = a 2 , S the surface of the cube in Prob. 1 . 

4. (Green’s first formula) Verify (8) for f = a, 

g = y 2 4- z 2 , 5 the surface of the box 0 ^ a ^ 1, 
0SyS2,0gzS3. 

5. (Green’s second formula) Verify (9) for the data in 
Prob. 3. 

6. (Green’s second formula) Verify (9) for / - a 4 , 
g = y 2 and the cube in Prob. 1. 

7. (Volume as a surface integral) Show that a region T 
with boundary surface S has the volume 

V = ~r \\r cos <£ dA 
3 J S J 

where r is the distance of a variable point P: (a, 3', z) 
on 5 from the origin O and <t> is the angle between the 
directed line OP and the outer normal of 5 at P. (Make 
a sketch.) 

8 . Find the volume of a ball of radius a by means of the 
formula in Prob. 7. 

9. Show that a region T with boundary surface 5 has the 
volume 

V = JJxdydz 
s 

-\\ydzd* 



10. TEAM PROJECT. Divergence Theorem and 
Potential Theory. The importance of the divergence 
theorem in potential theory is obvious from (7) —(9) 
and Theorems 1 —3. To emphasize it further, consider 
functions f and g that are harmonic in some domain D 
containing a region T with boundary surface S such that 
T satisfies the assumptions in the divergence theorem. 
Prove and illustrate by examples that then: 

(a) Jjg^dA = J j j |grad g| 2 dV. 

S T 

(b) If dgfdn = 0 on S, then g is constant in T. 



(d) If dfldn - dg/dn on S, then / = g + c in 7, where 
c is a constant. 

(e) The Laplacian can be represented independently 
of coordinate systems in the form 

V 2 / = Jim — f f dA 

d(2>-0 V(T) JJdn 

S(T) 

where d{T) is the maximum distance of the points of a 
region T bounded by 5(7) from the point at which the 
Laplacian is evaluated and V(T) is the volume of T. 


10.S Stokes’s Theorem 


Having seen the great usefulness of Gauss’s divergence theorem, we now turn to the 
second “big” theorem in this chapter, Stokes’s theorem. This theorem transforms line 
integrals into surface integrals and conversely. Hence it generalizes Green’s theorem of 
Sec. 10.4. Stokes’s theorem involves the curl 


( 1 ) 


curlF = 


i 

d/dx 

F 1 


j 

d/dy 
F 2 


k 

d/dz 

F 3 I 


(see Sec. 9.9). 
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THEOREM 1 


EXAMPLE 1 


Stokes’s Theorem 9 

(Transformation Between Surface and Line Integrals) 

Let S be a piecewise smooth i 9 oriented surface in space and let the boundary of S 
be a piecewise smooth simple closed curve C. Let F(x, y, z) be a continuous vector 
function that has continuous first partial derivatives in a domain in space containing 
S. Then 

(2) J j (curlF) # n<A4 = F«r f (s)ds. 

s c 

Here n is a unit normal vector of S and t depending on n, the integration around C 
is taken in the sense shown in Fig. 251. Furthermore , r' = dr Ids is the unit tangent 
vector and s the arc length of C. 

In components , formula (2) becomes 



= £(Fi dx + F z dy + F 3 dz ). 


Here, F = [F l9 F 2 , F 3 ], N = [N lt N 2 , N 3 ], ndA = Ndudv, 
r ds = [dx, dy ; dz], and R is the region with boundary curve C in the uu-plane 
corresponding to S represented by r (u, v). 


The proof follows after Example 1. 



Fig. 251. Stokes's theorem 



Fig. 252. Surface S in Example 1 


Verification of Stokes’s Theorem 

Before we prove Stokes’s theorem, let us first get used to it by verifying it for F = [y, z, x] and S the paraboloid 
(Fig. 252) 

z = /(*, y)= 1 - (* 2 + y\ z ^ 0. 


Solution . The curve C, oriented as in Fig. 252, is the circle r(j) = [cos s , sin s, 0]. Its unit tangent vector 

is r'($) = [—sin s , cos s , 0]. The function F = [y, z> x] on C is F(r(s)) = [sin s , 0, cos s]. Hence 


<f> F* dr = J F(r(.v))*r'(.s) ds = I [(sin ^)( — sin + 0 + 0] ds = -it. 

J n -'n 


9 Sir GEORGE GABRIEL STOKES (1819-1903), Irish mathematician and physicist who became a professor 
in Cambridge in 1849. He is also known for his important contribution to the theory of infinite series and to 
viscous flow (Navier-Stokes equations), geodesy, and optics. 

“Piecewise smooth” curves and surfaces are defined in Secs. 10.1 and 10.5. 
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We now consider the surface integral. We have F x = v, F 2 = z, F 3 = a\ so that in (2*) we obtain 

curl F = curl [F x , F 2 , F 3 ] = curl [y, z, jc] = [-1, -1, -11. 

A normal vector of S is N = grad (z - /(.v, y)) = [2a-, 2y, 1]. Hence (curl F)*N = -2v - 2y - 1. Now 
n dA = N dx dy (see (3*) in Sec. 10.6 with a*, y instead of //, u). Using polar coordinates r. 0 defined by 
x - r cos 0, y = r sin 0 and denoting the projection of S into the .vy-plane by ft, we thus obtain 


J J*(curl F) • n clA = J J(curl F) • N dx dy = J J(-2x - 2y - 1 ) dx dy 


R 

J2.TT A 


= J J (-2r(cos 6 4- sin 0) - l )rdrdO 

0=0 r =0 


= J (cos 0 + sin 0) ~ ^ dd = 


0 + 0 - - (277) = 


PROOF We prove Stokes’s theorem. Obviously, (2) holds if the integrals of each component on 
both sides of (2*) are equal; that is, 



We prove this first for a surface S that can be represented simultaneously in the forms 
(6) (a) z = /(*, y), (b) y = g(x, z ), (c) x = h(y\ z). 

We prove (3), using (6a). Setting it = jc, v = y, we have from (6a) 

r(w, v) = r(x, y) = [a% y, /(a*, y)] = xi 4- yj + /k 
and in (2), Sec. 10.6, by direct calculation 


N = r M x r, = r* x = [-/*, 1] = -/«! - f y j + k. 


Note that N is an upper normal vector of S, since it has a positive ^-component. Also, 
R = S*, the projection of S into the xy-plane, with boundary curve C = C* (Fig. 253). 
Hence the left side of (3) is 


(7) 



dFi 

d y . 


dx dy . 


We now consider the right side of (3). We transform this line integral over C = C* into 
a double integral over S* by applying Green’s theorem [formula (1) in Sec. 10.4 with 
F z = 0]. This gives 



dF x 

dx dy. 

dy 



SEC. 10.9 Stokes’s Theorem 


471 


EXAMPLE 2 


EXAMPLE 3 



Fig. 253. Proof of Stokes’s theorem 


Here, F x = F t (x 9 y , f(x 9 y )). Hence by the chain rule (see also Prob. 

_ dF x (x y y, /(,y, y)) _ _ dF^x, y, z) _ c )F x (x 9 y 9 z) 
dy dy dz 


10 in Problem Set 9.6), 
df 

— [z = f(x, 3 ’)]- 
ay 


We see that the right side of this equals the integrand in (7). This proves (3). Relations 
(4) and (5) follow in the same way if we use (6b) and (6c), respectively. By addition we 
obtain (2*). This proves Stokes’s theorem for a surface S that can be represented 
simultaneously in the forms (6a), (6b), (6c), 

As in the proof of the divergence theorem, our result may be immediately extended to 
a surface S that can be decomposed into finitely many pieces, each of which is of the kind 
just considered. This covers most of the cases of practical interest. The proof in the case 
of a most general surface S satisfying the assumptions of the theorem would require a limit 
process; this is similar to the situation in the case of Green’s theorem in Sec. 10.4. ■ 

Green’s Theorem in the Plane as a Special Case of Stokes’s Theorem 

Let F = [F lt F 2 ] = Fi i + F 2 j be a vector function that is continuously differentiable in a domain in the 
Av-plane containing a simply connected bounded closed region S whose boundary C is a piecewise smooth 
simple closed curve. Then, according to ( 1 ), 

dF z dF i 

(curl F)*n = (curl F)*k = — — — . 

dv dy 

Hence the formula in Stokes’s theorem now takes the form 

This shows that Green’s theorem in the plane (Sec. 10.4) is a special case of Stokes’s theorem (which we needed 
in the proof of the latter!). I 

Evaluation of a Line Integral by Stokes’s Theorem 

Evaluate f c F* r ds, where C is the circle x 2 + y 2 = 4, z = “3, oriented counterclockwise as seen by a person 
standing at the origin, and, with respect to right-handed Cartesian coordinates, 

F = [y, xz 3 , -;v 3 ] = vi + .vz 3 j - z/k. 

Solution. As a surface 5 bounded by C we can take the plane circular disk .v 2 + _v 2 = 4 in the plane : = -3. 
Then n in Stokes’s theorem points in the positive ^-direction; thus n = k. Hence (curlF)*n is simply the 
component of curl F in the positive c-direction. Since F with z = — 3 has the components F x = y, F z = -27.v, 
F 3 = 3v 3 . we thus obtain 

dF 2 dF l 

(curl F) • n = — — = -27 - I = -28. 

dx dy 
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EXAMPLE 4 



Fig. 254. Example 4 


EXAMPLE 5 


Hence the integral over S in Stokes’s theorem equals —28 times the area 47rof the disk S. This yields the answer 
-28 *477= — 1 12 tt —352. Confirm this by direct calculation, which involves somewhat more work. ■ 


Physical Meaning of the Curl in Fluid Motion. Circulation 

Let S r be a circular disk of radius r 0 and center P bounded by the circle C rQ (Fig. 254). and let 
F(0 = F(.v. v. ~) be a continuously differentiable vector function in a domain containing S rQ . Then by Stokes’s 
theorem and the mean value theorem for surface integrals (see Sec. 10.6), 


F*r' ds = J^j*(curl F)«n dA — (curl F)*n(P*)A ro 
C 'o s r 0 

where A Vq is the area of S r and P* is a suitable point of S Tq . This may be written in the form 

(curl F)*n(P*) = <P F*r ' ds. 

Ar o 

In the case of a fluid motion with velocity vector F = v, the integral 


t: 


r' ds 


is called the circulation of the How around C rQ . It measures the extent to which the corresponding fluid motion 
is a rotation around the circle C Ty . If we now let r 0 approach zero, we find 


( 8 ) 


(curl v)*n(P) = lim 

o 


I 



that is, the component of the curl in the positive normal direction can be regarded as the specific circulation 
(circulation per unit area) of the flow in the surface at the corresponding point. H 


Work Done in the Displacement around a Closed Curve 

Find the work done by the force F = Zvy 3 sin z i + 3.v 2 y 2 sin - j + ,v 2 y 3 cos z k in the displacement around the 
curve of intersection of the paraboloid z — x 2 + y 2 and the cylinder ( x - l) 2 + y 2 = 1. 

Solution . This work is given by the line integral in Stokes’s theorem. Now F = grad /, where / = .v 2 y 3 sin z 
and curl (grad /) = 0 (see (2) in Sec. 9.9), so that (curl F)*n = 0 and the work is 0 by Stokes’s theorem. This 
agrees with the fact that the present field is conservative (definition in Sec. 9.7). H 


Stokes's Theorem Applied to Path Independence 

We emphasized in Sec. 10.2 that the value of a line integral generally depends not only 
on the function to be integrated and on the two endpoints A and B of the path of integration 
C, but also on the particular choice of a path from A to B. In Theorem 3 of Sec. 10.2 we 
proved that if a line integral 

(9) f F(rWr = f (F 1 dx + F 2 dy + F 3 dz) 

J c J c 

(involving continuous F 1? F 2 , F 3 that have continuous first partial derivatives) is path 
independent in a domain D ; then curlF = 0 in D. And we claimed in Sec. 10.2 that, 
conversely, curl F = 0 everywhere in D implies path independence of (9) in D provided 
D is simply connected. A proof of this needs Stokes’s theorem and can now be given as 
follows. 

Let C be any closed path in D. Since D is simply connected, we can find a surface S 
in D bounded by C. Stokes’s theorem applies and gives 

^ (Fj dx + F 2 dy + F 3 dz) = Ft' ds = j J (curl F)*n dA 
c c 
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for proper direction on C and normal vector n on S. Since curl F = 0 in D, the surface 
integral and hence the line integral are zero. This and Theorem 2 of Sec. 10.2 imply that 
the integral (9) is path independent in D. This completes the proof. ■ 




1-8 


DIRECT INTEGRATION OF THE SURFACE 
INTEGRALS 


Evaluate the integral 
F and 5. 


If< 


(curl F) 


n dA directly for the given 


1. F = [4z 2 , 16 a, 0], S: z = y (0 S a S 1, 0 S y 5 1) 

2. F = [0, 0, 5 a cos z]j 

S: a 2 + y 2 = 4, y S 0, 0 S z g 

3. F = [-<?», <?*, e x \ 

S: z = * + y (0 S x S 1, 0 S y S 1) 

4 . F = [3 cos y, cosh z, a - ], 

S the square 0 S r = 2, 0 = y S 2, z = 4 

5. F = [e 2z , e z sin y, e z cos >], 

S: z = /(0SrS4,0SySl) 

6. F = [z 2 , a 2 , v 2 ], S: z 2 = x 2 + v 2 , y 6 0, 0 S z g 2 

7. F = [z 2 , §a, 0], 

S the square 0 a ^ < 2 , 0 y ^ a, z = 1 

8. F = [y 3 4 5 6 7 , -a 3 , 0], S: a 2 + y 2 S I, z = 0 


9. Verify Stokes’s theorem for F and S in Prob. 7. 

10. Verify Stokes’s theorem for F and S in Prob. 8. 

1 1 1— 18| EVALUATION OF j> F« t' ds 

Calculate this line integral by Stokes’s theorem, clockwise 
as seen by a person standing at the origin, for the following 
F and C. Assume the Cartesian coordinates to be right- 
handed. (Show die details.) 


11. F = [~3y, 3a, z], C the circle x 2 + y 2 = 4, z = 1 

12. F = [4 z, -2a*, 2a], 

C the intersection of a* 2 + y 2 = L and : = v + 1 

13. F = [y 2 , a* 2 , —.v + z]y around the triangle with 

vertices (0, 0, L), (1, 0, 1), (1, 1, 1) 

14. F = [y. Ay 3 , -zy 3 ], 

C the circle a 2 + y 2 = a 2 , z = b (> 0) 

15. F = [y, z 2 , a 3 ], C as in Prob. 12 

16. F = [a 2 , y 2 , z 2 ], 

C the intersection of a 2 + y 2 + z 2 = 4 and z = y 2 

17. F = [cos uy, sin 7ta, 0], around the rectangle with 
vertices (0, 1, 0), (0, 0, 1), (1, 0, 1), (1, 1, 0) 

18. F = [z, a, y], C as in Prob. 13 

19. (Stokes’s theorem not applicable) Evaluate <P F • r 1 ds, 

J c 

F = (a 2 + y 2 )" 1 [— y, a], C: a 2 + y 2 = 1 , z = 0, oriented 
clockwise. Why can Stokes’s theorem not be applied? 
What (false) result would it give? 

20. WRITING PROJECT. Grad, Div, Curl in 
Connection with Integrals. Make a list of ideas and 
results on this topic in this chapter. See whether you 
can rearrange or combine parts of your material. Then 
subdivide the material into 3-5 portions and work out 
the details of each portion. Include no proofs but simple 
typical examples of your own that lead to a better 
understanding of the material. 




STIONS AND PROBLEMS 


1. List the kinds of integrals in this chapter and how the 
integral theorems relate some of them. 

2. How can work of a variable force be expressed by an 
integral? 

3. State from memory how you can evaluate a line integral. 
A double integral. 

4. What do you remember about path independence? Why 
is it important? 

5. How did we use Stokes’s theorem in connection with 
path independence? 

6. State the definition of curl. Why is it important in this 
chapter? 

7. How can you transform a double integral or a surface 
integral into a line integral? 


8. What is orientation of a surface? What is its role in 
connection with surface integrals? 

9. State the divergence theorem and its applications from 
memory. 

10. State Laplace’s equation. Where in physics is it 
important? What properties of its solutions did we 
discuss? 


11-20 


LINE INTEGRALS j F (r)-dr 
(WORK INTEGRALS/ 0 


Evaluate, with F and C as given, by the method that seems 
most suitable. Recall that if F is a force, the integral gives 
the work done in a displacement along C. (Show the details.) 
11. F = [a 2 , y 2 , z 2 ], 

C the straight-line segment from (4, 1 , 8) to (0, 2, 3) 
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12. F = [cosz, —sinz, —.v sinz - y cosz], C the 
straight-line segment from (-2. 0, |?r) to (4. 3, 0) 

13. F = [.v - y, 0, e*], 

C; y = 3a 2 , z = 2 a for .v from 0 to 2 

14. F = [vz, 2 za, xy ], 

C the circle a 2 + y 2 = 9, z = 1, counterclockwise 

15. F = [-3y 3 , 3 a 3 + cosy, 0], 

C the circle a 2 -I- y 2 = 16, z = 0, counterclockwise 

16. F = [sin 77T, cos tta, sin tta], 

C the boundary of 0 = a = 1/2, 0 ^ y ^ 2, z = 2x 

17. F = [9 z, 5a, 3y], 

C the ellipse a 2 + v 2 = 9, z = a + 2 

18. F = [cosh a, e 4 " tan z]. C: a 2 + y 2 = 4, z = a 2 . 
(Sketch C.) 

19. F = [z 2 , a 3 , y 2 ], C; a 2 + y 2 = 4, a + y + z = 0 

20. F = [a 2 , y 2 , y 2 v], C the helix 

r = [2 cos /, 2 sin /, 6f] from (2, 0, 0) to (0, 2, 3 tt) 


21-25 


DOUBLE INTEGRALS, 
CENTER OF GRAVITY 


Find the coordinates a. y of the center of gravity of a mass 
of density /(a, y) in the region R. (Sketch R. Show the 
details.) 

21. / = 2av, R the triangle with vertices (0, 0), (1, 0), 

(M) 

22. / = 1, R: 0 ^ y ^ 1 — a 2 

23. / = 1, R: a 2 + y 2 ^ a 2 , y ^ 0 


24. / = a 2 + y 2 , R: a 2 + y 2 ^ 1, a ^ 0, y ^ 0 

25. / = 2a 2 , R the region below y = a + 2 and above 


26-35 


SURFACE INTEGRALS JJ F-n dA 


Evaluate this integral directly or, if possible, by the 
divergence theorem. (Show the details.) 

26. F = [2.v 2 , 4 y, 0], 


S: x + y + z = 1, a £ 0,)’ § 0, z § 0 

27. F = [y, -v. 0], 

S: 3.v + 2y + z = 6, .v § 0, y g 0, z S 0 

28. F = [a- - y. y - z, z - a], 

S the sphere of radius 5 and center 0 

29. F = [y 2 , a 2 , z 2 ], 

S the surface of a 2 + y 2 = 4, 0 = z = 5 

30. F = [y 3 , a 3 , 3z 2 ], 

S the portion of the paraboloid z = a 2 + y 2 , z = 4 

31. F = [sin 2 a, — y sin 2 a, 5z], 

S the surface of the box |x| = a, |y| = b, |z| = c 

32. F = [l, I, a], S: a 2 + y 2 + 4z 2 = 4, j g 0 

33. F = [a, Ay, z], S: x 2 + y 2 = I, 0 S z S h 


34. F as in Prob. 33, S the complete boundary of 
a 2 + y 2 Sl,0SzS/i 

35. F = [e", 0, ze*], S the rectangle with vertices (0, 0, 0), 
(1, 2, 0), (0, 0, 5), (1, 2, 5) 


-SUMMARY OF: CHAPTER 10. 

Vector Integral Calculus. Integral Theorems 


Chapter 9 extended differential calculus to vectors, that is, to vector functions 
v(a\ y. z) or v(f). Similarly, Chapter 10 extends integral calculus to vector functions. 
This involves line integrals (Sec. 10.1), double integrals (Sec. 10.3), siuface 
integrals (Sec. 10.6), and triple integrals (Sec. 10.7) and the three “big” theorems 
for transforming these integrals into one another, the theorems of Green (Sec. 10.4), 
Gauss (Sec. 10.7), and Stokes (Sec. 10.9). 

The analog of the definite integral of calculus is the line integral (Sec. 10.1) 

(1) f F(r)*dr = f (F x dx + F 2 dy + F z dz) = f F(r (/))• ^ dt 

J C J C J a dt 

where C: r(f) = [a(/), y(t), z(0] = x(t)i + y(t) j + z(/)k (a ^ ^ b) is a curve in 
space (or in the plane). Physically, (1) may represent the work done by a (variable) 
force in a displacement. Other kinds of line integrals and their applications are also 
discussed in Sec. 10.1. 
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Independence of path of a line integral in a domain D means that the integral 
of a given function over any path C with endpoints P and Q has the same value for 
all paths from P to Q that lie in D; here P and Q are fixed. An integral (1) is 
independent of path in D if and only if the differential form F t dx + F 2 dy + F 3 dz 
with continuous F lt F 2 , F 3 is exact in D (Sec. 10.2). Also, if curl F = 0, where 
F = [Fi, F 2 , F 3 ], has continuous first partial derivatives in a simply connected 
domain D, then the integral (1) is independent of path in D (Sec. 10.2). 

Integral Theorems. The formula of Green’s theorem in the plane (Sec. 10.4) 

<2) 

transforms double integrals over a region R in the xy-plane into line integrals over 
the boundary curve C of R and conversely. For other forms of (2) see Sec. 10.4. 
Similarly, the formula of the divergence theorem of Gauss (Sec. 10.7) 


(3) JJJdiv F cIV = ff F*n clA 

T S 

transforms triple integrals over a region T in space into surface integrals over the 
boundary surface S of 7, and conversely. Formula (3) implies Green’s formulas 

(4) ffl (fV 2 g + V/- Vg)clV= Jffy- dA, 

T S 

(5) J/J(/V 2 , - gV 2 f) dV = //(/ ^ - g dA. 

Finally, the formula of Stokes’s theorem (Sec. 10.9) 

(6) J J(curl F)*n dA — <j> Ft \s) ds 


transforms surface integrals over a surface S into line integrals over the boundary 
curve C of S and conversely. 






PART C 

Fourier Analysis. 
Partial 
Differential 
Equations 


CHAPTER 11 Fourier Series, Integrals, and T ransforms 
CHAPTER 12 Partial Differential Equations (PDEs) 

Fourier analysis concerns periodic phenomena, as they occur quite frequently in 
engineering and elsewhere — think of rotating parts of machines, alternating electric 
currents, or the motion of planets. Related periodic functions may be complicated. This 
situation poses the important practical task of representing these complicated functions in 
terms of simple periodic functions, namely, cosines and sines. These representations will 
be infinite series, called Fourier series . 1 

The creation of these series was one of the most path-breaking events in applied 
mathematics, and we mention that it also had considerable influence on mathematics as 
a whole, on the concept of a function, on integration theory, on convergence theory for 
series, and so on (see Ref. [GR7] in App. 1). 

Chapter 1 1 is concerned mainly with Fourier series. However, the underlying ideas can 
also be extended to nonperiodic phenomena. This leads to Fourier integrals and 
transforms . A common name for the whole area is Fourier analysis. 

Chapter 12 deals with the most important partial differential equations (PDEs) of physics 
and engineering. This is the area in which Fourier analysis has its most basic applications, 
related to boundary and initial value problems of mechanics, heat flow, electrostatics, and 
other fields. 


l JEAN-BAPTISTE JOSEPH FOURIER (1768-1830), French physicisi and mathematician, lived and taught 
in Paris, accompanied Napoleon in the Egyptian War. and was later made prefect of Grenoble. The beginnings 
on Fourier series can be found in works by Euler and by Daniel Bernoulli, but it was Fourier who employed 
them in a systematic and general manner in his main work, Theorie analytiqite de la chalettr ( Analytic Theory 
of Heat, Paris, 1822), in which he developed the theory of heat conduction (heat equation; see Sec. 12.5). making 
these scries a most important tool in applied mathematics. 
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CHAPTER 1 1 


Fourier Series, Integrals, 
and Transforms 

Fourier series (Sec. 11.1) are infinite series designed to represent general periodic 
functions in terms of simple ones, namely, cosines and sines. They constitute a very 
important tool, in particular in solving problems that involve ODEs and PDEs. 

In this chapter we discuss Fourier series and their engineering use from a practical point 
of view, in connection with ODEs and with the approximation of periodic functions. 
Application to PDEs follows in Chap. 12. 

The theory of Fourier series is complicated, but we shall see that the application of these 
series is rather simple. Fourier series are in a certain sense more universal than the familiar 
Taylor series in calculus because many discontinuous periodic functions of practical interest 
can be developed in Fourier series but, of course, do not have Taylor series representations. 

In the last sections (1 1 .7-1 1 .9) we consider Fourier integrals and Fourier transforms, 
which extend the ideas and techniques of Fourier series to nonperiodic functions and have 
basic applications to PDEs (to be shown in the next chapter). 

Prerequisite: Elementary integral calculus (needed for Fourier coefficients) 

Sections that may be omitted in a shorter course: 11.4-11.9 

References and Answers to Problems: App. 1 Part C, App. 2. 


n.i Fourier Series 



Fourier series are the basic tool for representing periodic functions, which play an 
important role in applications. A function /( a) is called a periodic function if /( a) is 
defined for all real a (perhaps except at some points, such as a* = ±7t/2, ±37t/2, • • * for 
tan x) and if there is some positive number p , called a period of /(a), such that 

(1) /(a + p) = /(a) for all a. 

The graph of such a function is obtained by periodic repetition of its graph in any interval 
of length p (Fig. 255). 

Familiar periodic functions are the cosine and sine functions. Examples of functions 
that are not periodic are a, a 2 , a 3 , e x , cosh a, and In a, to mention just a few. 

If fix) has period /?, it also has the period 2 p because (1) implies 
fix + 2 p) = /([ a +p]+p) = fix +p) = /(a), etc.; thus for any integer n = 1 , 2, 3, • • • , 

(2) fix + np) = fix) for all a. 
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Furthermore if /( x) and g( x) have period p, then af(x) -I- bg(x) with any constants a and 
b also has the period p. 

Our problem in the first few sections of this chapter will be the representation of various 
functions f(x) of period 2tt in terms of the simple functions 

(3) l, cos x , sin x , cos 2x , sin 2x, * • • , cos nx , sin nx 9 • • • . 

All these functions have the period 2tt. They form the so-called trigonometric system. Figure 
256 shows the fust few of them (except for the constant 1, which is periodic with any period). 
The series to be obtained will be a trigonometric series, that is, a series of the form 

a 0 4- a x cos x + b x sin x + a 2 cos 2x + b 2 sin 2x + • • • 

oo 

' ' = a 0 4- 2 ( a n cos fVC + bn nx )- 

n= 1 

a 0 , a Xi b ly a 2 , b 2i • • • are constants, called the coefficients of the series. We see that each 
term has the period 2tt. Hence if the coefficients are such that the series converges , its 
sum will be a function of period 2i t. 

It can be shown that if the series on the left side of (4) converges, then inserting 
parentheses on the right gives a series that converges and has the same sum as the series 
on the left. This justifies the equality in (4). 

Now suppose that f(x) is a given function of period 2 tt and is such that it can be 
represented by a series (4), that is, (4) converges and, moreover, has the sum f(x). Then, 
using the equality sign, we write 


oo 

(5) f(x) = a 0 + 2 ( a n cos nx + b n sin nx) 

n-1 



cosx 



cos 2x cos 3x 



sin 2x 


Fig. 256. Cosine and sine functions having the period 27T 


sin x 


sin 3x 
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EXAMPLE 1 


and call (5) the Fourier series of f(x). We shall prove that in this case the coefficients 
of (5) are the so-called Fourier coefficients of f(x), given by the Euler formulas 


(a) 

«o = 

1 

2 ir 

f fix) dx 

— 7T 




1 | 

r 


(b) 

a n = 

7T J 

1 f(x) cos nx dx 

— 7T 

n = 1,2, 

(c) 

K = 

7 T J 

\ f(x) sin nx dx 

— 7T 

n = 1, 2, 


The name “Fourier series” is sometimes also used in the exceptional case that (5) with 
coefficients (6) does not converge or does not have the sum f(x) — this may happen but 
is merely of theoretical interest. (For Euler see footnote 4 in Sec. 2.5.) 

A Basic Example 

Before we derive the Euler formulas (6), let us become familiar with the application of 
(5) and (6) in the case of an important example. Since your work for other functions will 
be quite similar, try to fully understand every detail of the integrations, which because of 
the n involved differ somewhat from what you have practiced in calculus. Do not just 
routinely use your software, but make observations: How are continuous functions (cosines 
and sines) able to represent a given discontinuous function? How does the quality of the 
approximation increase if you take more and more terms of the series? Why are the 
approximating functions, called the partial sums of the series, always zero at 0 and tt? 
Why is the factor 1 In (obtained in the integration) important? 

Periodic Rectangular Wave (Fig. 257a) 

Find the Fourier coefficients of the periodic function /( x) in Fig. 257a. The formula is 

f-k if -7r<.v<0 

(7) fix) = \ and /(.v + 2tt) = fix). 

t k if 0 < x < 7r 

Functions of this kind occur as external forces acting on mechanical systems, electromotive forces in electric 
circuits, etc. (The value of f(x) at a single point does not affect the integral: hence we can leave fix) undefined 
at x = 0 and x = ±7r.) 

Solution . From (6a) we obtain a 0 = 0. This can also be seen without integration, since the area under the 
curve of f(x) between - and it is zero. From (6b), 



because sin nx = 0 at — 7r, 0. and tt for all n = I, 2, • • • . Similarly, from (6c) we obtain 
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Since the a n are zero, the Fourier series of f(x ) is 
4 k l I 

(8) — I sin jc -I- *“ sin 3x 

7T \ 3 

The partial sums are 

4 k 

Si = — sin .v, S 2 - 

7 T 

Their graphs in Fig. 257 seem to indicate that the series is convergent and has the sum /( jc), the given function. 
We notice that at x = 0 and x = ir y the points of discontinuity of /(jc), all partial sums have the value zero, the 
arithmetic mean of the limits —k and k of our function, at these points. 

Furthermore, assuming that f(x) is the sum of the series and setting x = tt/ 2, we have 



This is a famous result obtained by Leibniz in 1673 from geometric considerations. It illustrates that the values 
of various series with constant terms can be obtained by evaluating Fourier series at specific points. M 

Derivation of the Euler Formulas (6) 

The key to the Euler formulas (6) is the orthogonality of (3), a concept of basic importance, 
as follows. 


+ — sin 5x + 




4k ( . I \ 

— ( sin x + — sin 3jc I , 
77 \ 3 }' 


etc., 


THEOREM 1 


Orthogonality of the Trigonometric System (3) 

The trigonometric system (3) is orthogonal on the interval — 7r ^ x = it (hence also 
on 0 = x = 27 t or any other interval of length 2 tt because of periodicity); that is , 
the integral of the product of any two functions in (3) over that interval is 0, so that 
for any integers n and m, 


(a) 

J cos nx cos mx dx = 0 

— 7T 

( n m) 




(b) 

I sin nx sin mx dx = 0 

“77 

( n =£ m) 




(c) 

J sin nx cos mx dx = 0 

(n =A m or n = m). 


PROOF This follows simply by transforming the integrands trigonometrically from products into 
sums. In (9a) and (9b), by (11) in App. A3.1, 


C 1 r 1 r 7r 

I cos nx cos mx dx = — - I cos (n + m)x dx + — I cos (n - m)x dx 

— IT 2 77- 2 — 7T 

r . . .if* i r* 

I sin nx sin mx dx = — J cos (n - m)x dx - — J cos (n + m)x dx. 

— TT ^ — 7T 2 J — tt 
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Since m # n (integer!), the integrals on the right are all 0. Similarly, in (9c), for all integer 
m and n (without exception; do you see why?) 

7T If 77 . 1 C 

I sin nx cos mx dx = — J sin (n + m)x dx + — J sin (n — m)x dx — 0 + 0. ■ 

— 7T 2 77- ^ —TT 


Application of Theorem 1 to the Fourier Series (5) 

We prove (6a). Integrating on both sides of (5) from —tt to tt, we get 


f/wW 


CO 

a 0 + X ( a n cos nx -h sin nx) 

n— 1 


dx. 


We now assume that termwise integration is allowed. (We shall say in the proof of 
Theorem 2 when this is true.) Then we obtain 


I /(x) dx = a 0 dx + 2 ( 0 n I cos nx dx + I sin nx dx I . 

77" —TT n=l ' — «■ / 


The first term on the right equals 27nz 0 . Integration shows that all the other integrals are 
0. Hence division by 27 t gives (6a). 


We prove (6b). Multiplying (5) on both sides by cos mx with my fixed positive integer 
m and integrating from — tt to i r, we have 


(10) J f(x) cos mx dx = J 


oo ~ 

Oq + 2 ( a n cos nx + sin nx) cos mx dx. 
n — 1 


We now integrate term by term. Then on the right we obtain an integral of a 0 cos mx, 
which is 0; an integral of a n cos nx cos mx, which is a m ir for n = m and 0 for n =£ m by 
(9a); and an integral of b n sin nx cos mx, which is 0 for all n and m by (9c). Hence the 
right side of (10) equals c^tt. Division by tt gives (6b) (with m instead of n). 


We finally prove (6c). Multiplying (5) on both sides by sin mx with my fixed positive 
integer m and integrating from —it to it, we get 


( 11 ) 


J f(x) sin mx dx 

— TT 



OO 

a 0 + X ( a n cos nx *f b n sin nx) 

n= 1 




sin /nx dx. 


Integrating term by term, we obtain on the right an integral of a 0 sin mx, which is 0; an 
integral of a n cos nx sin mx, which is 0 by (9c); and an integral of b n sin nx sin mx, which 
is b m 7r if n-m and 0 if n ^ m, by (9b). This implies (6c) (with n denoted by m). This 
completes the proof of the Euler formulas (6) for the Fourier coefficients. ■ 
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Convergence and Sum of a Fourier Series 

The class of functions that can be represented by Fourier series is surprisingly large and 
general. Sufficient conditions valid in most applications are as follows. 


THEOREM 2 


Representation by a Fourier Series 

Let fix) be periodic with period 2i t and piecewise continuous (see Sec. 6.1) in the 
interval — 7r^x ^ tt. Furthermore , let fix) have a left-hand derivative and a 
right-hand derivative at each point of that interval Then the Fourier series (5) of 
f{x) [with coefficients ( 6 )] converges. Its sum is fix), except at points x 0 where f(x) 
is discontinuous . There the sum of the series is the average of the left - and 
right-hand limits 2 of f(x) at x 0 . 


PROOF We prove convergence in Theorem 2. We prove convergence for a continuous function 
fix) having continuous first and second derivatives. Integrating ( 6 b) by parts, we obtain. 


1 r 

a n = — I fix) cos nx dx = 

77 J 


fix) sin nx 


flTT 


J17T J 


sin nx dx. 


117T 

The first term on the right is zero. Another integration by parts gives 

cos nx dx. 




fix) cos nx 


n 2 n 


I r 71 

— I f'M 

nTT -L, 


The first term on the right is zero because of the periodicity and continuity of f'ix). Since 
f" is continuous in the interval of integration, we have 

\f\x)\ < M 

for an appropriate constant M. Furthermore, |cos /u| S 1. It follows that 


l^nl 


m 2 
n tt 


C U 1 C 2M 

f ix) cos nx dx < — 5 — M dx = — 5 - 
J' trir /r 



Fig. 258. Left- and 
right-hand limits 

/0 - 0) = i, 

/( 1 + 0) =4 

of the function 
f x 2 if x < 1 
x/2 


^he left-hand limit of /(. x) at .v 0 is defined as the limit of /(.v) as x approaches jt 0 from the left 
and is commonly denoted by /(.v 0 - 0). Thus 

/(.v 0 - 0) = lim f(x 0 — / 1 ) as // — > 0 through positive values. 

hr-* 0 

The right-hand limit is denoted by f(x 0 + 0) and 

f(x 0 + 0) = lim /(.v 0 h) as h — > 0 through positive values, 
h— 0 

The left- and right-hand derivatives of fix) at .v 0 are defined as the limits of 

fix 0 ~h) - f(x 0 - 0) , f(x o ~r h) - f(x 0 + 0) 


-/z 


and 


/Ir- 


respectively, as /z 0 through positive values. Of course if fix) is continuous at .v 0t the last term in 
both numerators is simply /(a* 0 ). 
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Similarly, \b n \ < 2 Min 2 for all n. Hence the absolute value of each term of the Fourier 
series of /(a*) is at most equal to the corresponding term of the series 

W + 2 M ^1 + ] + -^2 + -^2 + 32 + 32 + ’ ’ 'j 

which is convergent. Hence that Fourier series converges and the proof is complete. 
(Readers already familiar with uniform convergence will see that, by the Weierstrass test 
in Sec. 15.5, under our present assumptions the Fourier series converges uniformly, and 
our derivation of (6) by integrating term by term is then justified by Theorem 3 of 
Sec. 15.5.) 

The proof of convergence in the case of a piecewise continuous function f(x ) and the 
proof that under the assumptions in the theorem the Fourier series (5) with coefficients 
(6) represents /(a*) are substantially more complicated; see, for instance. Ref. [C12]. ■ 

EXAMPLE 2 Convergence at a Jump as Indicated in Theorem 2 

The rectangular wave in Example 1 has a jump at x = 0. Its left-hand limit there is — k and its right-hand limit 
is k (Fig. 257). Hence the average of these limits is 0. The Fourier series (8) of the wave does indeed converge 
to this value when ,v = 0 because then all its terms are 0. Similarly for the other jumps. This is in agreement 
with Theorem 2. H 


Summary. A Fourier series of a given function /(a) of period 27ris a series of the form 
(5) with coefficients given by the Euler formulas (6). Theorem 2 gives conditions that are 
sufficient for this series to converge and at each a to have the value /(a), except at 
discontinuities of /(a), where the series equals the arithmetic mean of the left-hand and 
right-hand limits of /(a) at that point. 




1. (Calculus review) Review integration techniques for 
integrals as they are likely to arise from the Euler 
formulas, for instance, definite integrals of x cos nx, 
a 2 sin nx, e~ 2x cos nx, etc. 


2-3 


FUNDAMENTAL PERIOD 


Th & fundamental period is the smallest positive period. Find 
it for 


2. cos a, sin a, cos 2a, sin 2a, cos 7ta, sin tta, 
cos 2 tta, sin 2ttx 


3. cos nx t 


2 77 A 

sin ha, cos — ; — , 
k 


I'TTX 

sin — — , 
k 


2 77/? A 277// A 

cos — - — , sin — - — 
k k 


6. (Change of scale) If f(x) has period /?, show that f(ax), 
a 0, and f(x/b ), b 0, are periodic functions of a 
of periods p/a and bp , respectively. Give examples. 


7-12 


GRAPHS OF 27T-PERIODIC FUNCTIONS 


Sketch or graph /(a), of period 277, which for —77 < x < 77 
is given as follows. 

7. f(x) = X 8. fix) = e-W 

9. f(x) = ir — |.v| 10. f(x) = |sin 2x\ 


11. fix) = 

12 . /( x) = 


l-x 3 \f~1T<X<0 
[ x 3 if 0 < X < IT 
( 1 if — 77 < A < 0 

Icos \x if 0 < X < 77 


4. Show that / = const is periodic with any period but 
has no fundamental period. 

5. If f{x) and #(a) have period p t show that 
hi x) = a fix) 4* bgix) ( a , b, constant) has the period p. 
Thus all functions of period p form a vector space. 


13-24 


FOURIER SERIES 


Showing the details of your work, find the Fourier series 
of the given /(a), which is assumed to have the period 2tt. 
Sketch or graph the partial sums up to that including 
cos 5.v and sin 5a. 
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-n 0 n 

15. 



-7t OK 

18. 


-7t 0 K 

19. 



21. f(x ) = X 2 (“ 77 < X < 77) 

22. fix ) = A* 2 (0 < a < 2tt) 


f A* 2 if —\tT < A' < 577 

23. /(a) = 

l^77 2 if §77 < X < §77 





—4a* if —77 < a* < 0 

24. fix) = • 

. 4a if 0 < A < 77 

25. (Discontinuities) Verify the last statement in Theorem 
2 for the discontinuities of /(a) in Prob. 13. 

26. CAS EXPERIMENT. Graphing. Write a program for 
graphing partial sums of the following series. Guess 
from the graph what fix) the series may represent. 
Confirm or disprove your guess by using the Euler 
formulas. 

(a) 2(sin a + 5 sin 3a + 5 sin 5a 4- • • •) 

- 2(| sin 2a 4- § sin 4a 4- § sin 6a • • •) 
4 

(b) 5 4 « ( cos x + 9 cos 3 a + ^ cos 5a 4- • • •) 

77 

(c) §7 r 2 4- 4(cos a — | cos 2a 4- 5 cos 3a — jq cos 4a 
4 - - ■ ■ ■) 

27. CAS EXPERIMENT. Order of Fourier Coefficients. 
The order seems to be Un if / is discontinous, and 1/n 2 
if / is continuous but f = dfldx is discontinuous, l/n 3 
if / and f are continuous but f n is discontinuous, etc. 
Try to verify this for examples. Try to prove it by 
integrating the Euler formulas by parts. What is the 
practical significance of this? 

28. PROJECT. Euler Formulas in Terms of Jumps 
Without Integration. Show that for a function whose 
third derivative is identically zero. 



where n = l, 2, • • • and we sum over all the jumps j Sf 
jsJs of fy f'* respectively, located at a s . 

29. Apply the formulas in Project 28 to the function in 
Prob. 21 and compare the results. 

30. CAS EXPERIMENT. Orthogonality. Integrate and 
graph the integral of the product cos mx cos fix (with 
various integer m and n of your choice) from — a to a 
as a function of a and conclude orthogonality of cos 
mx and cos nx (m =£ n) for a = tt from the graph. For 
what m and n will you get orthogonality for a = tt/2, 
77/3, 77/4? Other al Extend the experiment to cos mx 
sin rix and sin mx sin /ia. 
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11.2 Functions of Any Period p = 2L 

The functions considered so far had period 2 tt, for the simplicity of the formulas. Of 
course, periodic functions in applications will generally have other periods. However, we 
now show that the transition from period p = 2ir to a period 2 L is quite simple. The 
notation p = 2L is practical because L will be the length of a violin string (Sec. 12.2) or 
the length of a rod in heat conduction (Sec. 12.5), and so on. 

The idea is simply to find and use a change of scale that gives from a function g( v) of 
period 2tt a function of period 2 L. Now from (5) and (6) in the last section with g(v) 
instead of /( x) we have the Fourier series 


( 1 ) 

with coefficients 

( 2 ) 


oo 

g(v) = a 0 4- 2 ( a n cos no + b n sin nv) 

n-1 


a o — 


Uyi 




i r 

2^L 8(V)dV 

i c 

— I g(v) cos nv do 

TT J -„ 

1 C 

— I g(v) sin nv dv. 

TT J -„ 


We can now write the change of scale as v — kx with k such that the old period v = 2ir 
gives for the new variable x the new period x = 2 L. Thus, 2n= k2L. Hence k = tt/L and 

(3) v — kx — t ix/L. 

This implies dv = (tt/L) dx, which upon substitution into (2) cancels 1/2 tt and Mtt and 
gives instead the factors 1/2L and ML. Writing 

(4) g(v) = f(x), 

we thus obtain from (1) the Fourier series of the function f(x) of period 2 L 


(5) 


f(x) = a 0 + 2 {a n cos x + b n sin —■ 


with the Fourier coefficients of f(x) given by the Euler formulas 


(a) 


1 

r L 


a 0 = 

2L 

1 fix) dx 

J -L 


(b) 


1 

1 v mrx 


a n = 

L J 

1 fix) cos dx 

-L L 

n= 1,2,-- 

(c) 

b n = 

iJ 

r L ,, . . nirx 
f(x) sin — — dx 
—L L 

« = 1,2,*- 
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CHAP. 11 Fourier Series, Integrals, and Transforms 


EXAMPLE 1 


EXAMPLE 2 


Just as in Sec. 1 1.1, we continue to call (5) with any coefficients a trigonometric series. 
And we can integrate from 0 to 2 L or over any other interval of length p = 2 L. 

Periodic Rectangular Wave 

Find the Fourier series of the function (Fig. 259) 

'0 if — 2 < a* < — l 

/(*) = •* if -l<*< 1 p = 2L = 4, L = 2. 

.0 if 1 < a < 2 

Solution . From (6a) we obtain a 0 = kJ2 (verify!). From (6b) we obtain 
2 1 

1 f v MTX if. niTX , 2k MT 

a " = 2 J./ W C0s — dx = 2 J.,* C0S — * = ^ s,n T ■ 

Thus a n — 0 if n is even and 

a n = 2k/mr if n = 1, 5, 9, • • • , o n = -2klmr if n = 3. 7, 1 1, • • • . 

From (6c) we find that b n = 0 for n = I, 2, • • * . Hence the Fourier series is 


/<*) 


k 2k ( 7T 1 3tt 1 57t \ 

= 2 + "5T \ cos T v " I cos ~ x + ? cos ~ x - + •■ )■ 




fix) 

ft 





! 



1 

r 

- 2-10 ] 

L 2 * 


Fig. 259. Example 1 


Periodic Rectangular Wave 

Find the Fourier series of the function (Fig. 260) 


[-k if -2 < jc < 0 

f{x) =• | p = 2L = 4, L- 2. 

Ik if 0 < a* < 2 

Solution . a 0 = 0 from (6a). From (6b), with I !L = 1/2, 


1 f f° MTX f 2 MTX I 

a 7i = 7 I J i-k) cos ~Y~ dx + J k cos -J- dx J 

if 2k mtx 1° 2k mtx | 2 1 

= — I sin — + — sin — — I = 0, 

2 L nir 2 j _ 2 nir 2 0 J 


so that the Fourier series has no cosine terms. From (6c), 


i 

" 2k 

MTX 

0 

2k MTX 

2“ 

2 

MT 

COS ~ — 
2 

-2 

— cos - — 
MT 2 

lo_ 


= (I — COS MT — COS MT + 

MT 


[Akim: if n = 1, 3, ■ 

D = 

l 0 if n = 2, 4, • ■ 
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EXAMPLE 3 


Hence the Fourier series of f(x) is 
4k 


4k ( 7T 1 3*77 1 5*77 \ 

fix) = — I sin — X + y sin —a* + y sin —a + • • • I . 

It is interesting that we could have derived this from (8) in Sec. 11.1, namely, by the scale change (3). Indeed, 
writing v instead of a, we have in (8), Sec. 1 1.1, 


4k / 

— I sin t? 

TT \ 


1 1 

+ — sin 3v 4* — sin 5v + 




Since the period 2ir in u corresponds to 2 L = 4, we liave k = i tJL = tt/ 2 and v = kx = ttx/2 in (3); hence we 
obtain the Fourier series of /(a), as before. ■ 



u(t) 


Fig. 260. Example 2 

Half-Wave Rectifier 


-T:f(o 0 id® 

Fig. 261. Half-wave rectifier 


A sinusoidal voltage E sin oji. where / is time, is passed through a half-wave rectifier that clips the negative 
portion of the wave (Fig. 261). Find the Fourier series of the resulting periodic function 


«(/) 


r o if -z, < / < o, 

l£ sin cot if 0 < t < L 


2tt 

p = 2L = , L = 

a> 


Solution. Since u = 0 when -L< t < 0, we obtain from (6a), with t instead of a. 


a Q 


= — f 

2ir J n 


E sin cor dr = — 


J . 7 ! 

n 


and from (6b), by using formula (11) in App. A3. 1 with x = cot and y — not, 

-tHu* _ -7r/w 

r 

£ sin <*>/ cos dt — — — I [sin (1 + »)«/ + sin (l — /?)cu/] rfr. 
“'0 277 •'o 

If w = 1 , the integral on the right is zero, and if n = 2, 3, • * ♦ , we readily obtain 

rrlta 

0 


_ (oE r cos (1 + n)<ot cos (1 — n)<ot | 1 

° n 27 t |_ (1 -I- n)co (1 - n)o J c 

_ E I —cos (1 + n)ir 4- 1 -cos (1 - rt) tt -I- 1 \ 
2tt \ 1 4- n 1 - n / 


1)7T 


If n is odd, this is equal to zero, and for even n we have 

El 2 2 \ 2 E 

° n 2tt \ 1 + n + I - n j (n — l)(/j + 

In a similar fashion we find from (6c) that b x = £72 and £ n = 0 for « = 2, 3, • • • . Consequently, 

2E ( i I \ 

"Tin cos2& " + n cos4ft " + " j • 


(n = 2,4, •••)• 


E E 

«(0 = — + — sin eu/ • 

77 2 
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CHAP. 11 Fourier Series, Integrals, and Transforms 


j JM -5 E T -E3EE3E 


1 1-11 1 FOURIER SERIES FOR PERIOD p = 2L 

Find the Fourier series of the function fix), of period p = 2L, 
and sketch or graph the first three partial sums. (Show the 
details of your work.) 

1. fix) = -1 (-2 < .v < 0), fix ) = 1 (0 < a* < 2), p = 4 

2. fix) = 0 (-2 < a* < 0), fix) = 4 (0 < a < 2), p = 4 

3. f(x) = x 2 (-!<*<]), p = 2 

4. /(jc) = tta- 3 /2 (-1 < jt < !), p = 2 

5. fix) = sin tta* (0 < x < 1 ), p — 1 

6. /(a) = COS 7TX ( < X < ^), p = 1 

|.v| (— 1 < A* < 1), p = 2 

1 + A* if - 1 < A* < 0 _ 9 

I — x if 0 < A- < 1, P - 1 

9. fix) = I - a 2 (—1 < x < I), p = 2 

10. /(a) = 0 (-2 < A- < 0), /(a) = a (0 < a < 2), p = 4 

11. /(a)=-a (- 1 < A < 0), /(A)=A (0 < A < 1 ), 

/(a) =1 (1 < A < 3), /? = 4 


7. /(A) = 

8. fix) = 


12. (Rectifier) Find the Fourier series of the function 
obtained by passing the voltage u(t) = V 0 cos 10077/ 
through a half-wave rectifier. 

13. Show that the familiar identities 
cos 3 x = | cos a + \ cos 3a and 

sin 3 a = | sin a — \ sin 3a can be interpreted as 
Fourier series expansions. Develop cos 4 a. 


14. Obtain the series in Prob. 7 from that in Prob. 8. 

15. Obtain the series in Prob. 6 from that in Prob. 5. 

16. Obtain the series in Prob. 3 from that in Prob. 21 of 
Problem Set 11.1. 

17. Using Prob, 3, show that 

I _ I , 1 !_ 4 . . _ J__2 

1 4*9 16 * “ 12 77 ’ 

18. Show that 1 + 4 + | + ^+ ,,, = g7r 2 . 

19. CAS PROJECT. Fourier Series of 2L-Periodic 
Functions, (a) Write a program for obtaining partial 
sums of a Fourier series (1). 

(b) Apply the program to Probs. 2-5, graphing the first 
few partial sums of each of the four series on common 
axes. Choose the first five or more partial sums until 
they approximate the given function reasonably well. 
Compare and comment. 

20. CAS EXPERIMENT. Gibbs Phenomenon. The 

partial sums .v n (A*) of a Fourier series show oscillations 
near a discontinuity point. These oscillations do not 
disappear as n increases but instead become sharp 
“spikes.” They were explained mathematically by 
J. W. Gibbs 3 . Graph 5 u (a) in Prob. 10. When n - 50, 
say, you will see those oscillations quite distinctly. 
Consider other Fourier series of your choice in a similar 
way. Compare. 


11.3 Even and Odd Functions. 

Half-Range Expansions 

The function in Example 1, Sec. 11.2, is even , and its Fourier series has only cosine 
terms. The function in Example 2, Sec. 11.2, is odd, and its Fourier series has only sine 
terms. 

Recall that g is even if g(— a) = g( a), so that its graph is symmetric with respect to the 
vertical axis (Fig. 262). A function h is odd if h(—x) = —//(a) (Fig. 263). 

Now the cosine terms in the Fourier series (5), Sec. 11.2. are even and the sine terms 
are odd. So it should not be a surprise that an even function is given by a series of 
cosine terms and an odd function by a series of sine terms. Indeed, the following holds. 


3 JOSIAH WILLARD GIBBS (1839-1903), American mathematician, professor of mathematical physics at 

Yale from 1871 on. one of the founders of vector calculus [another being O. Heaviside (see Sec. 6.1)], 
mathematical thermodynamics, and statistical mechanics. His work was of great importance to the development 
of mathematical physics. 
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THEOREM 1 


Fourier Cosine Series, Fourier Sine Series 

The Fourier series of an even function of period 2 L is a “Fourier cosine series” 


a) 


f(x) = a 0 + X COS X 
«=1 L 


(/ even) 


with coefficients (note: integration from 0 to L only!) 


1 C L 2 f L nirx 

(2) a 0 = — J f(x) dx, a n = — f(x) cos -j- dx , n = 1, 2, • • • . 


77ze Fourier series of an odd function of period 2 L is a “Fourier sine series’ 5 


(3) 

with coefficients 


fix) = 2 K sin — x if odd) 

n= 1 L 


(4) 



dx. 


PROOF Since the definite integral of a function gives the area under the curve of the function 
between the limits of integration, we have 


f gix) dx = 2 J gix) dx 

-L J 0 



for even g 


for odd h 


as is obvious from the graphs of g and h . (Give a formal proof.) Now let / be even. Then 
(6a), Sec. 1 1.2, gives a 0 in (2). Also, the integrand in (6b), Sec. 11.2, is even (a product 
of even functions is even), so that (6b) gives a^ in (2). Furthermore, the integrand in (6c), 
Sec. 11.2, is the even / times the odd sine, so that the integrand (the product) is odd, the 
integral is zero, and there are no sine terms in (1). 
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THEOREM 2 


EXAMPLE 1 


EXAMPLE 2 


Similarly, if f is odd, the integrals for a 0 and a n in (6a) and (6b), Sec. 1 1.2, are zero, 
f times the sine in (6c) is even, (6c) implies (4), and there are no cosine terms in (3). ■ 


The Case of Period 2m If L = tt, then f(x) 
coefficients 


a 0 + 2 a n cos nx (/ even) with 

n« 1 


(2*) 


1 r v 2 r~ 

a 0 = — I f(x) dx , a n = — I /(jc) cos nx dx , n = 1, 2, 

7T *^o TT Jn 


_2 

■7T ■'o 


OC 

and /(*) = ^ sin wa: (/ odd) with coefficients 

n=l 


(4*) 


2 r 77 

b n = — I /(x) sin nx dx , 
7T * , n 


« = 1 , 2 , 


For instance, /(*) in Example 1, Sec. 11.1, is odd and is represented by a Fourier sine 
series. 

Further simplifications result from the following property, whose very simple proof is 
left to the student. 


Sum and Scalar Multiple 

The Fourier coefficients of a sum f\ -f f 2 are the sums of the corresponding Fourier 
coefficients of f x and f 2 . 

The Fourier coefficients of cf are c times the corresponding Fourier coefficients 
off. 


Rectangular Pulse 

The function f*(x) in Fig. 264 is the sum of the function /(.v) in Example l of Sec 11.1 and the constant k. 
Hence, from that example and Theorem 2 we conclude that 


/*to = 



/ 1 1 
I sin -v + — sin 3x + — sin 5.v + 



Half-Wave Rectifier 

The function w(r) in Example 3 of Sec. 1 1.2 has a Fourier cosine series plus a single term o(t) = (E/2) sin cot. 
We conclude from this and Theorem 2 that u(t) - v(t) must be an even function. Verify this graphically. (See 
Fig. 265.) ■ 


f*(x) 

2k 








-n 0 it 2n 3n 4 n * 


Fig. 264. Example 1 
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EXAMPLE 3 


Sawtooth Wave 

Find the Fourier series of the function (Fig. 266) 

fix) = x + 7T if - 7 r < x < 7 r and fix + 27 r) = fix). 



(a) The function fix) 



( b ) Partial sums Sj, S 2 , S 3 , S 20 
Fig. 266. Example 3 


Solution. We have / = /1 + / 2 , where f 1 = a* and / 2 = 'J 7 - The Fourier coefficients of f 2 are zero, except 
for the first one (the constant term), which is 7 r. Hence, by Theorem 2, the Fourier coefficients a n , b n are those 
of/!, except for a Qy which is 7 r. Since f x is odd, a n = 0 for n = l, 2, • • • , and 


b 


n 


2 _ 

7T 


I /i(a) sin nx dx 
J o 


2 r ■ 

— I X Sll 
7T 


sin nx dx. 


Integrating by parts, we obtain 



2 

— cos mr. 
n 


Hence b\ = 2 , b 2 = —2/2, b 2 = 2/3, /> 4 = -2/4, • - • , and the Fourier series of fix) is 


/ 1 1 

/(„v) = 7r + 2 I sin x - — sin 2 a + — sin 3 a - + 


Half-Range Expansions 

Half-range expansions are Fourier series. The idea is simple and useful. Figure 267 
explains it. We want to represent fix) in Fig. 267a by a Fourier series, where f(x) may 
be the shape of a distorted violin string or the temperature in a metal bar of length L, for 
example. (Corresponding problems will be discussed in Chap. 12.) Now comes the idea. 
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EXAMPLE 4 



0 L/2 L x 

Fig. 268. The given 
function in Example 4 



L x 


(a) The given function fix) 

f x ix) I 



(c) fix) extended as an odd periodic function of period 2 L 
Fig. 267. (a) Function f{x) given on an interval 0 ^ x ^ L 

(b) Even extension to the full “range” (interval) —L^xSL (heavy curve) 
and the periodic extension of period 2 L to the x-axis 

(c) Odd extension to — L ^ x ^ L (heavy curve) and the periodic extension 
of period 2 L to the x-axis 


We could extend f(x) as a function of period L and develop the extended function into a 
Fourier series. But this series would in general contain both cosine and sine terms. We 
can do better and get simpler series. Indeed, for our given f we can calculate Fourier 
coefficients from (2) or from (4) in Theorem 1. And we have a choice and can take what 
seems more practical. If we use (2), we get (1). This is the even periodic extension 
of f in Fig. 267b. If we choose (4) instead, we get (3), the odd periodic extension f 2 of 
/ in Fig. 267c. 

Both extensions have period 2 L. This motivates the name half-range expansions: f is 
given (and of physical interest) only on half the range, half the interval of periodicity of 
length 2 L. 

Let us illustrate these ideas with an example that we shall also need in Chap. 12. 

“Triangle” and Its Half-Range Expansions 

Find the two half-range expansions of the function (Fig. 268) 

— x if 0 <.y<-^- 

L 2 

f(X) = 2* i 

=j;(L-x) if 2 <x<l - 


Solution . (a) Even periodic extension . From (2) we obtain 

„L/2 „L 

I I 2* f 
a 0 


l [" 2 k fU 2 2k f L 

,= I [t J 0 xdx+ T V 


(L - x) dx 


2 2k f U7T 2k f ti7r 

Q n = — — x cos x dx + — I (L — x) cos —x dx 

L L L j 0 L L J L!2 •* 
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We consider a n . For the first integral we obtain by integration by parts 


M2 

7177 

Lx 

mr 

L/2 

L ( 

x cos 

— x dx — 

: — sin 

-V 

— 

‘ 

'0 

L 

mr 

L 

0 

mr J 0 


,2 r 2 

L mr L 

— — sin — + » 0 

2mt 2 n 2 7T 2 


( HIT \ 

cos — - Ij 


Similarly, for the second integral we obtain 
M 


mr 

sin x dx 


f mr L mr \ L L C mr 

(L — x) cos — — a* dx = — (L — x) sin — - a* 4- — I sin — — A’ dx 

->U2 L n7r 1 \ L/2 nir J u 2 L 

( L ( L\ mr\ L? / mr \ 

\ nir \ 2) 2 ) n 2 tt 2 \ 2 ) 

We insert these two results into the formula for a n . The sine terms cancel and so does a factor l?. This gives 

4 k { mr \ 

a n — 9 9 2 COS COS «1T — 1 I . 

HTT \ 2 } 

Thus, 

a z = -l6k/(2 2 -n*), a 6 = -I6*/(6V). a 10 = -I6W(10 2 tt 2 ). • • • 

and a n = 0 if n =£ 2. 6, 10, 14, • • • . Hence the first half-range expansion of f(x) is (Fig. 269a) 

A 16 k /I 2 77 1 677 \ 

m = I " {¥ cos ~T X * e cos T x + ‘ ‘ j • 

This Fourier cosine series represents the even periodic extension of the given function /(.*), of period 2 L. 

(b) Odd periodic extension. Similarly, from (4) we obtain 

8 A mr 

(5) b n = -j-p sin — . 

Hence the other half-range expansion of /(*) is (Fig. 269b) 

8 A / 1 77 1 377 1 577 \ 

/to = —2 72 sin t x - sin ~r x + 72 sin ~r x ~ + ‘ • • • 


/ I it I 3tt 1 5tt \ 

I — s* sin — a* 7 sin — - x + —5* sin ——a* — h • • • I 

\ I 2 L 3 2 L 5 2 L ) 


This series represents the odd periodic extension of f(x ). of period 2 L. 
Basic applications of these results will be shown in Secs. 12.3 and 12.5. 



-L 0 L x 

(a) Even extension 



(b) Odd extension 

Fig. 269. Periodic extensions of /(*) in Example 4 
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PROBLEM SET 11.3 


7-9] EVEN AND ODD FUNCTIONS 

Are the following functions even, odd, or neither even nor 
odd? 

1. |jt|, a 2 sin /?a, x + a 2 , e~* x K In a, a cosh a 

2. sin (a 2 ), sin 2 a, a sinh a, |a 3 |, e ™, xe x , tan 2a, a/(1 + a 2 ) 
Are the following functions, which are assumed to be 
periodic of period 27 t. even, odd. or neither even nor odd? 

3. /(A) = A 3 ( “ 7T < a < 7f) 

4. f(x ) = a 2 (-77/2 < a < 3 77/2) 

5. fix) = e~ 4x (-7T < A < 77) 

6. /(a) = a 3 sin a ( — 7 T < a < 7 r) 

7. fix) = a|a| — a 3 ( — 7T < A < 77) 

8. fix) = 1 — A + A 3 — A 5 (— 77 < A < 77) 

9. fix) = 1/(1 + A 2 ) if — 77 < A < 0, fix) = “1/(1 + A 2 ) 

if 0 < A < 77 

10. PROJECT. Even and Odd Functions, (a) Are the 
following expressions even or odd? Sums and products 
of even functions and of odd functions. Products of 
even times odd functions. Absolute values of odd 
functions, fix) + /(-a) and fix) — /(—a) for arbitrary 
/(*). 

(b) Write J ex 9 1/(1 - a), sin (a + k). cosh (a + k) as 
sums of an even and an odd function. 

(c) Find all functions that are both even and odd. 

(d) Is cos 3 a even or odd? sin 3 a? Find the Fourier 
series of these functions. Do you recognize familiar 
identities? 

1 11-16 1 FOURIER SERIES OF EVEN AND ODD 
FUNCTIONS 

Is the given function even or odd? Find its Fourier series. 
Sketch or graph the function and some partial sums. (Show 
the details of your work.) 

11. fix) = 7 T — |.v| (— 77 < A < 77) 


12. fix) = 2a|a| (-1 < A< I) 

r a if — 77/2 < a < 77/2 

13. fix) = 

L 77 — A if 77/2 < A < 377/2 
f7 re' x if 77 < a < 0 


{ 77 <?~ 

ire x 


15. fix) = 

16. fix) = 


7 Te X if 0 < A < 77 

2 if -2 < x < 0 

0 if 0 < a < 2 

1 - £|a| if -2 < a < 2 

0 if 2 < a < 6 


(P = 8 ) 


1 7-25 HALF-RANGE EXPANSIONS 

Find (a) the Fourier cosine series, (b) the Fourier sine series. 
Sketch f(x) and its two periodic extensions. (Show the 
details of your work.) 

17. fix) =1 (0 < a < 2) 

18. /(a) = x (0 <x< i) 


19. fix) = 2 - 

- x (0 < x < 2) 

fO 

(0 < x < 2) 

20. fix) = 


ll 

(2 < a < 4) 

ri 

(0 < A < 1) 

21. fix) = 


12 

(1 < A < 2) 

f A (0 < X < 77/2) 

22. fix) = 


I 77/2 ( 77/2 < A < 77 ) 

23. fix) = a: 

(0 < a < L) 

24. f(x) = x 2 

(0 < A < L) 


25. f(x) = IT — .V (0 < A < 7 r) 

26. Illustrate the formulas in the proof of Theorem 1 with 
examples. Prove the formulas. 


11.4 Complex Fourier Series. Optional 

In this optional section we show that the Fourier series 

3C 

( 1 ) fix) = a 0 + 2 ( a n c °s nx + b n sin nx) 

71=1 

can be written in complex form, which sometimes simplifies calculations (see Example 1, 
on page 498). This complex form can be obtained because in complex, the exponential 
function e xt and cos t and sin t are related by the basic Euler formula (see (11) in Sec. 2.2) 

(2) e n = cos t + i sin /. Thus e~ zt = cos t — i sin /. 
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Conversely, by adding and subtracting these two formulas, we obtain 

(3) (a) cos t = e lt + e~ lt ), (b) sin t = ^X e%t ~ eT 11 ), 

From (3), using Hi = — / in sin 1 and setting t = hjc in both formulas, we get 

c/ n cos nx + b n sin ;u* = — ci n (e inx + e~ inx ) + — b n {e inx — e“ tnx ) 

jL 

= J («n - ib n )e inx + j (a n + ib n )e~ inx . 

We insert this into (1). Writing a 0 = c 0 , |(a„ — ib n ) = c n , and \(a n + ib n ) = k n , 
we get from (I) 

oc 

(4) m = Co + 2 (c n e inx + k n e~ inx ). 

n= 1 

The coefficients c 1? c 2 , • • • , and k x , k 2 , • • • are obtained from (6b), (6c) in Sec. 11.1 and 
then (2) above with / = nx. 


i i r If", 

C„ = — («„ - ibn) = Y~ J f( x X cos ,vc - ‘ sm ,ur ) dx = — J f(x)e 


: dx 


(5) 

K = -J (On + &«) = / /(•*)(' cos nx + i sin nx) dx - J f(x)e i7UC dx. 

Finally, we can combine (5) into a single formula by the trick of writing k n = c_ n . Then 
(4), (5), and c 0 = a 0 in (6a) of Sec. 11.1 give (summation from — »>!) 


( 6 ) 


m = 2 c n e ina , 

n=*— co 

c n = y- J f(x)e~ inx dx, n . = 0, ±1, ±2, • • • . 


This is the so-called complex form of the Fourier series or, more briefly, the complex 
Fourier series, of f(x). The c„ are called the complex Fourier coefficients of f(x). 

For a function of period 2 L our reasoning gives the complex Fourier series 


( 7 ) 


fix) = 2 c n e in **' L , 

n=— oc 

Cn= 2 7 f f^ e ~ inm/L dx ' « = 0 , ± 1 , ± 2 , • • • . 
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EXAMPLE 1 


Complex Fourier Series 

Find the complex Fourier series of /(.v) = e x if -it < x < tt and f(x + 27 r) = f(x) and obtain from it the usual 
Fourier series. 

Solution . Since sin nir = 0 for integer /?, we have 

e ±lU7r = cos rnr ± i sin hit = cos /itt = (- l) n . 


With this we obtain from (6) by integration 




Itt 1 — in 


= zr -«■')(- D n . 

27 r 1 — in 


On the right. 


1 


1 + in 


1 + in 


1 - in (1 - / n)( 1 + in) \ + n 2 
Hence die complex Fourier series is 


( 8 ) 


sinh 7 r 


and 


1 + //i 




= 2 sinh 7r. 


2 (-i) n - L - L ^ 1JW * 


1 + /!' 


2 


(-7T < JC < 7r). 


From this let us derive the real Fourier series. Using (2) with t = nx and / 2 = — 1, we have in (8) 

(1 + in)e tnx = (I + /«)(cos nx + i sin nx) = (cos nx — n sin nx) + i(n cos /uc + sin nx). 

Now (8) also has a corresponding term with —n instead of n. Since cos (-/lv) = cos nx and 
sin (— /ly) = -sin hjc, we obtain in this term 

(1 — in)e~ inx = (1 — //i)(cos nx — / sin nx) = (cos nx — n sin nx) — i(n cos nx + sin nx). 

If we add these two expressions, the imaginary parts cancel. Hence their sum is 

2(cos nx — n sin /lt), n = 1,2,' 

For n — 0 we get 1 (not 2) because there is only one term. Hence the real Fourier series is 


(9) 


2 sinh 7 r 


"_ 1 _ 1 __ 

_ 2 1 + 1 


1 


2 (cos x — sin x) + (cos 2x — 2 sin 2x) — f • • • 


In Fig. 270 the poor approximation near the jumps at ±7 r is a case of the Gibbs phenomenon (see CAS 
Experiment 20 in Problem Set 1 1 .2). H 



Fig. 270. Partial sum of (9), terms from n = 0 to 50 
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1. (Calculus review) Review complex numbers. 

2. (Even and odd functions) Show that the complex 
Fourier coefficients of an even function are real and 
those of an odd function are pure imaginary. 

3. (Fourier coefficients) Show that 

a Q ~~ ~ c — bn K c n c — n)* 

4. Verify the calculations in Example 1 . 

5. Find further terms in (9) and graph partial sums with 
your CAS. 

6. Obtain the real series in Example 1 directly from the 
Euler formulas in Sec. 11. 


7-13 


COMPLEX FOURIER SERIES 


Find the complex Fourier series of the following functions. 
(Show the details of your work.) 

7. f(x) = — lif— 7T<X<0, f(x ) = 1 if 0 < X < 7T 

8. Convert the series in Prob. 7 to real form. 

9. f(x) — x (-7T < x < n) 


10. Convert the series in Prob. 9 to real form. 

11. jf(A') = X 2 (— 7T < x < 7T) 

12. Convert the series in Prob. 1 1 to real form. 

13. f(x) = x (0 < x < 27 r) 


14. PROJECT. Complex Fourier Coefficients. It is very 
interesting that the c n in (6) can be derived directly by 
a method similar to that for a n and b n in Sec. 11.1. For 
this, multiply the series in (6) by e’" imx with fixed 
integer m, and integrate termwise from — 7r to 7? on 
both sides (allowed, for instance, in the case of uniform 
convergence) to get 




gi(n—m)x 


dx. 


Show that the integral on the right equals 2tt when 
n = rn and 0 when n # m [use (3b)], so that you get 
the coefficient formula in (6). 


1.5 Forced Oscillations 


Fourier series have important applications in connection with ODEs and PDEs. We show 
this for a basic problem modeled by an ODE. Various applications to PDEs will follow 
in Chap. 12. This will show the enormous usefulness of Euler’s and Fourier’s ingenious 
idea of splitting up periodic functions into the simplest ones possible. 

From Sec. 2.8 we know that forced oscillations of a body of mass m on a spring of 
modulus k are governed by the ODE 

(1) my" + cy f + ky = r(t) 

where y = y (f) is the displacement from rest, c the damping constant, k the spring constant 
(spring modulus), and r{t) the external force depending on time t. Figure 271 shows the 
model and Fig. 272 its electrical analog, an RLC - circuit governed by 



Fig. 271. Vibrating system under 
consideration 


Fig. 772. Electrical analog of the 
system in Fig. 271 (RLC-cir cuit) 
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EXAMPLE T 


(1*) LI" + Rl' + 7 = E'(t) (Sec. 2.9). 

We consider (1). If r(f) is a sine or cosine function and if there is damping (c > 0), 
then the steady-state solution is a harmonic oscillation with frequency equal to that of r(t). 
However, if /*(/) is not a pure sine or cosine function but is any other periodic function, 
then the steady-state solution will be a superposition of harmonic oscillations with 
frequencies equal to that of ;•(/) and integer multiples of the latter. And if one of these 
frequencies is close to the (practical) resonant frequency of the vibrating system (see 
Sec. 2.8), then the corresponding oscillation may be the dominant part of the response of 
the system to the external force. This is what the use of Fourier series will show us. Of 
course, this is quite surprising to an observer unfamiliar with Fourier series, which are 
highly important in the study of vibrating systems and resonance. Let us discuss the entire 
situation in terms of a typical example. 


Forced Oscillations under a Nonsinusoidal Periodic Driving Force 

In (I), let m = 1 (gm), c = 0.05 (gm/sec), and k = 25 (gm/sec 2 ), so that (1) becomes 


(2) /' + 0.05/ + 25 y = r(f) 

where /■(/) is measured in gm • cm/sec 2 . Let (Fig. 273) 


m = 


t + — if -7T</<0, 
2 


-f + Y if 0 <t <TT, 


r(/ + 2 tt) = r(/). 


Find the steady-state solution y(/). 


r(t) 



Fig. 273. Force in Example 1 


Solution . We represent r{t) by a Fourier series, finding 

4/1 1 \ 

(3) r(t) — — I cos t -f- ^ cos 3t + -rg cos 5/ 4- • • • I 

(lake the answer to Prob. 1 1 in Problem Set 1 1.3 minus §7rand write t for x). Then we consider the ODE 

(4) v" + 0.05/ + 25 v = -4~ cos m (n = i, 3, • • •) 

n 7r 

whose right side is a single term of the series (3). From Sec. 2.8 we know that the steady-state solution y n (t) 
of (4) is of the form 


(5) 


y n = A n cos nt + B n sin nt. 
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By substituting this into (4) we find that 

4(25 — h 1 2 ) 0.2 9 

(6) A n = — g— — , B n = — — , where D n = (25 - + (0.05 nf. 

/i ttZ/jj, 


Since the ODE (2) is linear, we may expect the steady-state solution to be 


(7) .v = .Vi + >’3 + >’5 + * ‘ ‘ 

where y n is given by (5) and (6). In fact, this follows readily by substituting (7) into (2) and using the Fourier 
series of /•(/), provided that termwise differentiation of (7) is permissible. (Readers already familiar with die notion 
of uniform convergence ISec. 15.51 may prove that (7) may be differentiated term by term.) 

From (6) we find that the amplitude of (5) is (a factor Vz>^ cancels out) 


C n = V/t tt 2 + B n 2 = 


4 


n 2 7r\/ r D^ 


Numeric values are 


Cj = 0.0531 
C 3 = 0.0088 

C 5 = 0.2037 

C 7 = 0.001 1 
C 9 = 0.0003. 

Figure 274 shows the input (multiplied by 0.1) and the output. For n = 5 the quantity D n is very small, the 
denominator of C 5 is small, and C 5 is so large that y 5 is the dominating term in (7). Hence the output is almost 
a harmonic oscillation of five times the frequency of the driving force, a little distorted due to the term y lt whose 
amplitude is about 25% of that of y 5 . You could make the situation still more extreme by decreasing the damping 
constant c. Try it. M 





e 



1. (Coefficients) Derive the formula for C n from A n and B n . 

2. (Spring constant) What would happen to the amplitudes 
C n in Example l (and thus to the form of the vibration) 
if we changed the spring constant to the value 9? If we 
took a stiffer spring with k = 81? First guess. 


3. (Damping) In Example 1 change c to 0.02 and discuss 
how this changes the output. 

4. (Input) What would happen in Example 1 if we 
replaced r(t) with its derivative (the rectangular wave)? 
What is the ratio of the new C n to the old ones? 
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5-1 1[ GENERAL SOLUTION 

Find a general solution of the ODE y " + o?y = r(t) with 

r(/) as given. (Show the details of your work.) 

5. r(/) = cos <ot , o) = 0.5, 0.8, 1.1, 1.5, 5.0, 10.0 

6 . /*(/) — cos a ) x t + cos a> 2 t (o) 2 w x 2 , (o 2 2 ) 

N 

7. /*(/) = 2 cos M =£ 1, 2, • • • , N 

n= 1 

8 . /*(/) = sin / + 5 sin 3/ 4- ^ sin 5/ 4- ^ sin 7/ 

r / +7T if — 7T < / < 0 

9. r(f) = ■ 

. — / 4“ 7T if 0 < / < 7T 
and /*(/ 4- 27 t) = /*(/), \(o\ ^ 0, 1, 3, * • • 

f / if — 7 t/2 < / < 7t/2 

10 . r(r) = 

l7r — r if 7r/2 < / < 37 r /2 
and r(/ 4- 27 t) = r(/), |o>| =£ 1,3,5,--- 

11 . /*(/) = — |sin /| if — tt < t < tt and 

/■(/ 4* 27t) = /■(/), |o>| =f 0, 2. 4. • • 4 

12. (CAS Program) Write a program for solving the ODE 
just considered and for jointly graphing input and 
output of an initial value problem involving that ODE. 
Apply the program to Probs. 5 and 9 with initial values 
of your choice. 

13. (Sign of coefficients) Some A n in Example 1 are positive 
and some negative. Is this physically understandable? 


1 14—17 [ STEADY-STATE DAMPED OSCILLATIONS 

Find the steady-state oscillation of y n 4- cy f 4- y = r(/) 
with c > 0 and /*(/) as given. (Show the details of your 
work.) 

14. r(f) = a n cos nr 

15. r(t) = sin 3 1 

f TTt if — 7772 < / < tt} 2 

16 . /*(/) = 

ItK'tt — t) if tt/2 < t < 3tt/ 2 
and r(t 4- 27t) = r(t) 

N 

17. r(/) = 2 sin ni 

n« 1 

18. CAS EXPERIMENT. Maximum Output Term. 

Graph and discuss outputs of y" 4- cy 4- ky = r(t) 
with r(/) as in Example 1 for various c and k with 
emphasis on the maximum C n and its ratio to the 
second largest |C n |. 

19-20 1 RLC- CIRCUIT 

Find the steady-state current I(t) in the RLC - circuit in 
Fig. 272, where R = 100 Cl, L = 10 H, C = 10 ” 2 F and 
E(t) V as follows and periodic with period 2i r. Sketch or 
graph the first four partial sums. Note that the coefficients 
of the solution decrease rapidly. 

19. E(t) = 200/ ( 7 r 2 - t 2 ) (-n < t < tt) 

'100(7 Tt 4“ r 2 ) if — 7r < t < 0 

20. E{t) = • 

LlOO(7T/ — t 2 ) if 0 < / < 7T 


11.6 Approximation by Trigonometric Polynomials 

Fourier series play a prominent role in differential equations. Another field in which they 
have major applications is approximation theory, which concerns the approximation of 
functions by other (usually simpler) functions. In connection with Fourier series the idea 
is as follows. 

Let fix) be a function on the interval — 77 ^ x ^ 7r that can be represented on this 
interval by a Fourier series. Then the Nth partial sum of the series 

N 

(1) f(x) a 0 4- 2 ( a n cos fVC + b n sin nx) 

n=l 

is an approximation of the given /(a*). It is natural to ask whether (1) is the “best” 
approximation of / by a trigonometric polynomial of degree N, that is, by a function 
of the form 

N 

(2) Fix) = A 0 4-2 (A n cos nx 4- B n sin nx) (N fixed) 

n= 1 

where “best” means that the “error” of the approximation is as small as possible. 



SEC. 11.6 


Approximation by Trigonometric Polynomials 


503 


Of course, we must first define what we mean by the error E of such an approximation. 
We could choose the maximum of |/ — F|. But in connection with Fourier series it is 
better to choose a definition that measures the goodness of agreement between f and 
F on the whole internal — j tt ^ x ^ tt. This seems preferable, in particular if / has jumps: 
F in Fig. 275 is a good overall approximation of /, but the maximum of |/ — F| (more 
precisely, the supremum) is large (it equals at least half the jump of / at x 0 ). We choose 

(3) E = f (f-Ffdx. 


This is called the square error of F relative to the function / on the interval —n t 
Clearly, E ^ 0. 

N being fixed, we want to determine the coefficients in (2) such that E is minimum. 
Since (/ - F) 2 = f 2 - 2 fF + F 2 , we have 


(4) 



2 f fFdx + J F 2 dx. 

— 7 T —IT 


We square (2), insert it into the last integral in (4), and evaluate the occurring integrals. 
This gives integrals of cos 2 nx and sin 2 nx (n ^ 1), which equal 7 r, and integrals of 
cos nx, sin nx, and (cos w;c)(sin mx ), which are zero (just as in Sec. 11.1). Thus 



t r r n 

J + 2 (a» 

_ 7r L n=l 


cos nx + B n sin nx) 


dx 


= '7t( 2A 0 2 + Ax 2 4- • • • + A n 2 + Bj 2 + • • • + B n 2 ). 


We now insert (2) into the integral of fF in (4). This gives integrals of / cos nx as well 
as f sin nx, just as in Euler’s formulas, Sec. 11.1, for and b n (each multiplied by A n 
or Hence 

J fF dx — 7r(2A 0 <Zo + ^i a i H" * * * + ^N a N + + • • • -f B^b^). 

“7 7 

With these expressions, (4) becomes 


(5) 


r » T N 

E = I f 2 dx- 2 tt 2 A 0 a 0 + 2 ( An*n + B n b n ) 

-7r L n-1 

• 2 (a „ 2 + O I • 

n=l J 


2 A ft 2 + 



Fig. 275. Error of approximation 


X 
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We now take A n = a n and B n = b n in (2). Then in (5) the second line cancels half of the 
integral-free expression in the first line. Hence for this choice of the coefficients of F the 
square error, call it £*, is 


( 6 ) 



JV 

f z dx—n 

2a 0 2 + 2 («n 2 + b n Z ) 

— IT 

71=1 


We finally subtract (6) from (5). Then the integrals drop out and we get terms 
A n 2 - 2 A n a n + a 2 = ( A n - a n ) 2 and similar terms ( B n - b n ) 2 : 


E - E* = 7T 


N 


2 (A 0 - a 0 f + 2 [(A n - a n f + {B n - b n f] 


n= 1 


Since the sum of squares of real numbers on the right cannot be negative, 


£ - £* ^ 0, thus E ^ £*, 


and £ = £* if and only if A 0 = a 0 , • * • , B N = b N . This proves the following fundamental 
minimum property of the partial sums of Fourier series. 


THEOREM 1 


Minimum Square Error 

The square error of F in (2) (with fixed N) relative to f on the interval — tt ^ x ^ 7 r 
is minimum if and only if the coefficients of F in (2) are the Fourier coefficients of 
f. This minimum value £* is given by (6). 


From (6) we see that £* cannot increase as N increases, but may decrease. Hence with 
increasing N the partial sums of the Fourier series of f yield better and better 
approximations to /, considered from the viewpoint of the square error. 

Since £* ^ 0 and (6) holds for every N, we obtain from (6) the important Bessel’s 
inequality 


( 7 ) 



dx 


for the Fourier coefficients of any function / for which integral on the right exists. (For 
F. W. Bessel see Sec. 5.5.) 

It can be shown (see [Cl 2] in App. 1) that for such a function /, Parseval’s theorem 
holds; that is, formula (7) holds with the equality sign, so that it becomes Parseval’s 
identity 4 


( 8 ) 


2<J 0 2 + 2 («n 2 + 
n—1 



dx. 


4 MARC ANTOINE PARSEVAL ( 1755-1 836), French matliemc'tfician. A physical interpretation of the identity 
follows in the next section. 
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EXAMPLE 1 Minimum Square Error for the Sawtooth Wave 

Compute the minimum square error E* of F(.x) with N = 1, 2, • • • . 10, 20 t — , 100 and 1000 relative to 

/( x) = x + 7r ( — ir < x < tt) 


on Uie interval — tt^x%, tt. 

1 1 (-l) w+1 

Solution . F(x) - tt + 2 (sin .v - — sin 2* + — sin 3.x — !-••• + — sin Nx) by Example 3 in 

Sec. 1 1.3. From this and (6), 2 3 N 

E* = J (x + 7 r) 2 dx — tt^I't? + 42 ~ j * 

Numeric values are: 



Fig. 276. F with 
N = 20 in Example 1 


N 

E * 

N 

E* 

N 

E* 

N 

E* 

1 

8.1045 

6 

1.9295 

20 

0.6129 

70 

0.1782 

2 

4.9629 

7 

1.6730 

30 

0.4120 

80 

0.1561 

3 

3.5666 

8 

1.4767 

40 

0.3103 

90 

0.1389 

4 

2.7812 

9 

1.3216 

50 

0.2488 

100 

0.1250 

5 

2.2786 

10 

1.1959 

60 

0.2077 

1000 

0.0126 


F = S lt S 2 , 5 3 are shown in Fig. 266 in Sec. 1 1.3, and F = S 20 is shown in Fig. 276. Although \f(x) — FX*)! 
is large at ±7r (how large?), where f is discontinuous, F approximates / quite well on the whole interval, except 
near ±7r, where “waves” remain owing to the Gibbs phenomenon (see CAS Experiment 20 in Problem Set 
11 . 2 ). 

Can you think of functions / for which E* decreases more quickly with increasing JV? H 


This is the end of our discussion of Fourier series, which has emphasized the practical 
aspects of these series, as needed in applications. In the last three sections of this chapter 
we show how ideas and techniques in Fourier series can be extended to nonperiodic 
functions. 





1-9 


MINIMUM SQUARE ERROR 


Find the trigonometric polynomial F(x) of the form (2) for 
which the square error with respect to the given f(x) on the 
interval —tt ^ .v ^ tt is minimum, and compute the 
minimum value for N = 1, 2, • * * , 5 (or also for larger 
values if you have a CAS). 

1. f(x) = x ( — tt < x < tt) 


2. f{x) = x 2 ( — tt < x < tt) 

3. f(x) = 1*1 (-77 < * < 77) 

4. /(*) = A * 3 ( — 77 < X < 77) 

5. f(x) = | sin A'| (“77 < * < 77) 

6. /(*) = (“77 < * < tt) 


7. fix) = 



if 

if 


“77 < * < 0 
0 < * < 77 


[x if “577 < X < 577 

8 - fix) = 

lo if 577 < X < §77 

9 . /(*) = *(* + 77) if -77 < * < 0 , /(*) = x(—x + 7 7) 
if 0 < x < TT 

10. CAS EXPERIMENT. Size and Decrease of E*. 
Compare the size of the minimum square error £* for 
functions of your choice. Find experimentally the 
factors on which the decrease of E* with N depends. 
For each function considered find the smallest N such 
that E* < 0.1. 

11. (Monotonicity) Show that the minimum square error 
(6) is a monotone decreasing function of N. How can 
you use this in practice? 
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PARSEVAL’S IDENTITY 

Using ParsevaFs identity, prove that the series have the 
indicated sums. Compute the first few partial sums to see 
that the convergence is rapid. 

12 . l + ^ r + -^- + ^ 5 - + -- - = -|- = 1.01467 8032 
(Use Prob. 15 in Sec. 11.1.) 

11 tt 2 

13 . 1 . + +••• = — = 1.233700550 

(Use Prob. 13 in Sec. 1 1.1.) 


14 I 2 • 3 2 + 3 2 • 5 2 + 5 2 • 7 2 + ' " 

7T 2 1 

= — - - = 0.11685 0275 
16 2 

(Use Prob. 5, this set.) 

15. 1 + “4 + ~4 + • * • = = 1.082323234 

(Use Prob. 21 in Sec. 11.1.) 

16- l + ^ + '^6 + ^' , "" = 960 = 1 00,44 ? 0 78 

(Use Prob. 9, this set.) 


11.7 Fourier Integral 

Fourier series are powerful tools for problems involving functions that are periodic or are of 
interest on a finite interval only. Sections 11.3 and 1 1 .5 first illustrated this, and various further 
applications follow in Chap. 12. Since, of course, many problems involve functions that are 
nonperiodic and are of interest on the whole x-axis, we ask what can be done to extend the 
method of Fourier series to such functions. This idea will lead to ‘‘Fourier integrals.” 

In Example 1 we start from a special function f L of period 2 L and see what happens 
to its Fourier series if we let L — » cc. Then we do the same for an arbitrary function f L 
of period 2 L. This will motivate and suggest the main result of this section, which is an 
integral representation given in Theorem 1 (below). 


EXAMPLE 1 


Rectangular Wave 

Consider the periodic rectangular wave f L (x) of period 2L > 2 given by 

f0 if -LC.v<- 

ftSA = ' 


if 

if 


l < .v < L . 


The left part of Fig. 277 shows this function for 2L = 4, 8, 16 as well as the nonperiodic function /(.v), which 
we obtain from f L if we let L — > cc, 


Six) = lim f L { x) = 

L— ce 


I 

.0 


if — 1 <x < 1 
otherwise. 


We now explore what happens to the Fourier coefficients of f L as L increases. Since f Tj is even, b n = 0 for 
all n. For a n the Euler formulas (6). Sec. 1 1.2, give 


«o 


If 1 ! If 1 H7TX 2 f 1 MTX 2 sil 

= u J_ * = 7 • “ n = I ir T * = 7 J 0 cos — dx = 7 “ 


sin {mr/L) 


httIL 


This sequence of Fourier coefficients is called the amplitude spectrum of f l because |a n | is the maximum 
amplitude of the wave a n cos (mrxfL). Figure 277 shows this spectrum for the periods 2 L = 4, 8, 16. We see 
that for increasing L these amplitudes become more and more dense on the positive u^-axis, where vv n = mrlL. 
Indeed, for 2 L = 4, 8, 16 we have 1, 3, 7 amplitudes per “half-wave” of the function (2 sin w n )t(Lw n ) (dashed 
in the figure). Hence for 2 L = 2 k we have 2* -1 — I amplitudes per half-wave, so that these amplitudes will 
eventually be everywhere dense on the positive it' n -axis (and will decrease to zero). 

The outcome of this example gives an intuitive impression of what about to expect if we turn from our special 
function to an arbitrary one, as we shall do next. ■ 
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Waveform fAx) 


fAx) 


— 1 1 — 

rn 

i—i—i 1 

-2 ( 

1 1 1 1 ! 

) 2 

-J 1 1 1 1 1 

X 

rn m 

2 L 
/ l m 

, r 

= 4 

n 


-4 C 

) 4 

X 


!"<= 2 L 

= 8— H 



rj* 



1 

, r 

n 

. 1 — 1 

-8 

( 

n r 

) 

8 * 

r 6 

ZLj ■ 

= lb 

*1 


m 




r 

□ 



-1 0 1 



n = 4 

S' 


n = 20 


n = 12 


n = 28 




Fig. 277. Waveforms and amplitude spectra in Example 1 


From Fourier Series to Fourier Integral 

We now consider any periodic function f L (x) of period 2 L that can be represented by a 
Fourier series 

OO 

^ J17T 

/ lW = + 2j (fln cos w n x + b n sin w n x\ w n = — 

n=l L 


and find out what happens if we let L-> ». Together with Example 1 the present calculation 
will suggest that we should expect an integral (instead of a series) involving cos wx and 
sin wx with w no longer restricted to integer multiples w = w n = nir/L of 7 r/L but taking 
all values. We shall also see what form such an integral might have. 

If we insert a n and b n from the Euler formulas (6), Sec. 1 1.2, and denote the variable 
of integration by v 7 the Fourier series of f L (x) becomes 


1 r L 1 I” 

fd x ) = T7 I frfv) dv + — 2 cos ™n x I fdv) COS w n v dv 

21 -*• L n-1 L -L 

+ sin w n x J fdv) sin w n v dv J . 


Aw = w n+1 - w, 


(ft + 1)7T 

L 


MT 

~L 


7T 

T ‘ 


We now set 
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Then ML = Awhr, and we may write the Fourier series in the form 


(1) fd x ) = TT fdP) dv + — 2 ( cos *v) Aw fd») cos w n v dv 
2L J- L -tt n=1 |_ J- L 

+ (sin w w x) Aw I / L (u) sin w n v dv 

J -L 


This representation is valid for any fixed L, arbitrarily large, but finite. 
We now let L —* « and assume that the resulting nonperiodic function 


f(x) = lim f L M 

L—»zo 



is absolutely integrable on the jc-axis; that is, the following (finite!) limits exist: 

(2) lim f |/(jc)| dx 4 lim f |/(.r)| dx ( written f |/(-v)| dx 

« J a V \ J -* 


Then ML — » 0, and the value of the first term on the right side of (1) approaches zero. 
Also A w = 7 r/L — > 0 and it seems plausible that the infinite series in (1) becomes an 
integral from 0 to <*>, which represents /(*), namely. 


1 d x f r x r°° 1 

(3) f(x) = — I cos wx f(v) cos wv dv 4- sin wx I f(v) sin wv dv dw . 

77 •'o L oc J 

If we introduce the notations 

1 r 00 lr 00 

(4) A(w) = — I f(v) cos wv dv, B(\v) = — f(v) sin wv dv 

TT TT 


we can write this in the form 


(5) f(x ) = [A(w) cos wx 4 B(w) sin wx] dw. 

J o 

This is called a representation of fix) by a Fourier integral. 

It is clear that our naive approach merely suggests the representation (5), but by no 
means establishes it; in fact, the limit of the series in (1) as Aw approaches zero is not 
the definition of the integral (3). Sufficient conditions for the validity of (5) are as follows. 


THEOREM 1 


Fourier Integral 

If fix) is piecewise continuous (see Sec. 6.1) in eveiy finite interval and has a 
right-hand derivative and a left-hand derivative at eveiy point (see Sec 11.1) and 
if the integral (2) exists, then fix) can be represented by a Fourier integral (5) with 
A and B given by (4). At a point where fix) is discontinuous the value of the Fourier 
integral equals the average of the left- and right-hand limits of fix) at that point 
(see Sec. 11.1). (Proof in Ref. [Cl 2]; see App. 1.) 
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EXAMPLE 2 


Applications of Fourier Integrals 

The main application of Fourier integrals is in solving ODEs and PDEs, as we shall see 
for PDEs in Sec. 12.6. However, we can also use Fourier integrals in integration and in 
discussing functions defined by integrals, as the next examples (2 and 3) illustrate. 


Single Pulse, Sine Integral 

Find the Fourier integral representation of the function 

fl if |.v| < 1 

m = 

10 if |a| > i 


(Fig. 278). 



Fig. 278. Example 2 


x 


Solution. From (4) we obtain 

i r 

A(w) = - 

7 T 


f(v) cos wv dv 

> 

B(w) = 


■ if, 

«• J -l 

1 f* ‘ 

- U' 


cos wv dv = 


l 

-i 


sin wv dv = 0 


and (5) gives the answer 

( 6 ) 



J f cos wx sin w 

0 w 


2 sin w 

7TW 


The average of the left- and right-hand limits of /(*) at * = I is equal to (1 4- 0)/2 t that is, 1/2. 
Furthermore, from (6) and Theorem 1 we obtain (multiply by tt! 2) 



00 

'it/2 

if 

0£x< 1. 


f cos wx sin w . 

tt/4 

if 

.v= 1 . 

(7) 

dw = 

J 0 w 

. o 

if 

*> 1 . 



We mention that this integral is called Dirichlet’s discontinous factor. (For P. L. Dirichlet see Sec. 10.8.) 
The case x = 0 is of particular interest. If x = 0, then (7) gives 


m 



TT 

7 ' 


We see that this integral is the limit of the so-called sine integral 


( 8 ) 


Si(«) = 



dw 


as u — > oc. The graphs of Si(«) and of the integrand are shown in Fig. 279. 

Tn the case of a Fourier series the graphs of the partial sums are approximation curves of the curve of the 
periodic function represented by the series. Similarly, in the case of the Fourier integral (5), approximations are 
obtained by replacing «> by numbers a. Hence the integral 


(9) 



cos w* sin w 
w 


dw 


approximates the right side in (6) and therefore fix). 
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Figure 280 shows oscillations near the points of discontinuity of /(a). We might expect that these oscillations 
disappear as a approaches infinity. But this is not true; with increasing a , they are shifted closer to the points 
x = ± 1 . This unexpected behavior, which also occurs in connection with Fourier series, is known as the Gibbs 
phenomenon. (See also Problem Set 1 1.2.) We can explain it by representing (9) in terms of sine integrals as 
follows. Using (1 1) in App. A3. 1, we have 

J a ja a 

cos wx sin w If sin (w + tv.r) 1 f sin (w — wx) 

dw - — I dw H I dw. 

0 W 7T J 0 w tt Jq w 

In the first integral on the right we set w + wx = 1. Then dw/w = dtft, and 0 ^ w ^ a corresponds to 
0 = / = (a + 1) a. In the last integral we set w — wx — —t. Then dw/w = dt/t , and 0 ^ w ^ a corresponds to 
0 = f = (.v — l)n. Since sin (-/) = -sin t, we thus obtain 

2 I cos iv.v sm w 
7 T Jq W 

From this and (8) we see that our integral (9) equals 

— Si(a[jc -1- 1]) - — Si (a[x - 1]) 

77 77 

and the oscillations in Fig. 280 result from those in Fig, 279. The increase of a amounts to a transformation 
of the scale on the axis and causes the shift of the oscillations (the waves) toward the points of discontinuity 
- I and 1. M 


dw 


. r (x+l)a . . . r (a 

= ±r “»i*_±r 

77 J n t TT Jt\ 


sin / 


dt. 



y 

r 

\ 

i 

o = 8 

l !•- A l 

y 

w 

i 

a=16 

y 

|/Vw 

wV\jj 

a = 32 

A*..l 

-2 V -: 

L 0 

1^2* -2-] 

l 0 

] 

L v 2x -2 -1 Ol 

] 

2x 


Fig. 280. The integral (9) for a = 8, 16, and 32 
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Fourier Cosine Integral and Fourier Sine Integral 

For an even or odd function the Fourier integral becomes simpler. Just as in the case of 
Fourier series (Sec. 11.3), this is of practical interest in saving work and avoiding errors. 
The simplifications follow immediately from the formulas just obtained. 

Indeed, if /(a) is an even function, then B(w) = 0 in (4) and 


(10) A(w) = — f f(v) cos wv dv . 

7 t J o 

The Fourier integral (5) then reduces to the Fourier cosine integral 

(11) f(x ) = I A(w) cos wx dw 

J o 

Similarly, if /(a) is odd, then in (4) we have A(w) = 0 and 

2 r 00 

(12) B(w) = — f(v) sin wv dv. 

7 r J o 

The Fourier integral (5) then reduces to the Fourier sine integral 

(13) /(a) = [ B(w ) sin wx dw 

•Jn 


(/ even). 


t f odd). 


Evaluation of Integrals 

Earlier in this section we pointed out that the main application of the Fourier integral is 
in differential equations but that Fourier integral representations also help in evaluating 
certain integrals. To see this, we show the method for an important case, the Laplace 
integrals. 


EXAMPLE 3 Laplace Integrals 

We shall derive the Fourier cosine and Fourier sine integrals of /(a) — e~ ,ex , where .v > 0 and k > 0 (Fig. 281), 
The result will be used to evaluate the so-called Laplace integrals. 


Solution . (a) From (10) we have A(w) 


2_ 

IT 


J e ku cos wv dv. Now. by integration by parts, 
o 



cos wv dv 


k 

k 2 + u- 2 



w 

— sin wv + cos wv 


)■ 


Fig. 281. f[x) in 
Example 3 


If v = 0. the expression on the right equals -k/(k 2 + w 2 ). If v approaches infinity, that expression approaches 
zero because of the exponential factor. Thus 


(14) 


A(w) = 


2khr 


By substituting this into (11) we thus obtain the Fourier cosine integral representation 

„ -Irr 2* COS U'A* 

/(.V) = e kx = — -s 5- dw 

Jq k + w 


(a* > 0, k> 0). 
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From this representation we see that 


cos wx 7 r 

~o 5* dw = — e~ 


I ,2 , 2 “ ,v "" o/. e 

•'o k + w 2 A: 

2 f 00 

(b) Similarly, from (12) we have Z?(w) = — I sin wu <tfy. By integration by parts, 

7T J 0 

( -kv . , u ' -kv l k - \ 

e sin wu dv = - —5 5- e | — sin wu + cos wu I . 

J k 2 + w 2 \ »v / 


(.v > 0, k > 0). 


sin wu + cos ivy 


This equals — w/(k 2 + w 2 ) if v = 0. and approaches 0 as v — > »>. Thus 


= . 2,2 • 

A: + iv 


From (13) we thus obtain the Fourier sine integral representation 

OC 

. 2 f iv sin iv.v 

/(•V) = = - J 72—2 

77 k + W 


From this we see that 


J 0 * 2 + >V 2 


dw — — e~ 


The integrals (15) and (17) are called the Laplace integrals. 


(a > 0, k> 0). 


-grR^BrLEM-S:E^I 


Mil EVALUATION OF INTEGRALS 

Show that the given integral represents the indicated 
function. Hint . Use (5), (1 1), or (13); the integral tells you 
which one, and its value tells you what function to consider. 
(Show the details of your work.) 


cos a w + w sin aw 
1 + w 2 


0 

if 

x < 0 

77/2 

if 

x = 0 

7 TC~ X 

if 

x > 0 


sm w — w cos w 


sin aw dw 


ttx/2 if 0 < a; < 1 

tt/4 'f x = 1 

0 if x > 1 


r 50 COS A 

3 * J 0 l + t 


CW 77 

— 2 dw - — e x if a > 0 
w 2 


J f sm w 

cos aw dw = tt!4 if 

0 w 

. 0 if 


7t/ 2 if 0 = a < 1 

tt/ 4 if a = 1 

0 if a > 1 


cos (ttw/2) 


COS AW dw 


J f sin 7 

0 T 


7 rw sin aw 


§ COS A if 0 < |a| < 77/2 

0 if |a| ^ 77/2 

| f sin a if 0 = a = 77 

1 0 if a > 77 


7-12 1 FOURIER COSINE INTEGRAL 
REPRESENTATIONS 


Represent /(a) as an integral (11). 

1 if 0 < a < a 

7. /(a) = 

.0 it a > a 
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8. fix) 


x 2 if 0 < x < a 

0 if x > a 


9. f{x) 


10. fix) 


11. fix) 


x if 0 < * < 1 

.0 if x > 1 

r a/ 2 if 0 < * < 1 

« 1 - x/2 if 1 < x < 2 

w 0 if x > 2 

'sin x if 0 < x < it 

0 if x > 77 


12. fix) 


e x if 0 < x < a 

0 if x > a 


13. CAS EXPERIMENT. Approximate Fourier Cosine 
Integrals. Graph the integrals in Prob. 7, 9, and 11 as 
functions of a. Graph approximations obtained by 
replacing «> with finite upper limits of your choice. 
Compare the quality of the approximations. Write a 
short report on your empirical results and observations. 


14-19 


FOURIER SINE INTEGRAL 
REPRESENTATIONS 


Represent fix) as an integral (13). 


16. f(x) 


17. fix) 

18. fix) 


19. fix) 


fl - x 2 if 0 < x < 1 

l 0 if A' > 1 

f 77 “ X if 0 < A < 7T 

l 0 if A > IT 

( COS A if 0 < A < 77 

l 0 if A > 7 T 

(a — x if 0 < a < <z 

l 0 if x > a 


20. PROJECT. Properties of Fourier Integrals 
(a) Fourier cosine integral. Show that (11) implies 

1 r ( w \ 

(al) fiax) = — I Al — I cos xw dw 
a \ a ) 

ia > 0) iScale change) 

(a2) a/(a) = I J9*(w») sin xw dw , 

J o 


B* = - 


dA_ 
dw 9 


A as in (10) 


(a3) 


* 2 /(a) = f A*(iv) 


cos xw dw, 


A* = 


d^A 

dw 2 ‘ 


14. f{x) = 


c 


if 0 < a < a 

if a > a 


{ sin a if 
0 if 


0 < A < 7 T 

A > 77 


(b) Solve Prob. 8 by applying (a3) to the result of 
Prob. 7. 

(c) Verify (a2) for fix) = 1 if 0 < a < a and 
f(x) = 0 if a > a. 

(d) Fourier sine integral. Find formulas for the 
Fourier sine integral similar to those in (a). 


11.8 Fourier Cosine and Sine Transforms 

An integral transform is a transformation in the form of an integral that produces from 
given functions new functions depending on a different variable. These transformations 
are of interest mainly as tools for solving ODEs, PDEs, and integral equations, and they 
often also help in handling and applying special functions. The Laplace transform 
(Chap. 6) is of this kind and is by far the most important integral transform in 
engineering. 

The next in order of importance are Fourier transforms. We shall see that these 
transforms can be obtained from the Fourier integral in Sec. 1 1 .7 in a rather simple fashion. 
In this section we consider two of them, which are real, and in the next section a third 
one that is complex. 
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Fourier Cosine Transform 

For an even function /( x), the Fourier integral is the Fourier cosine integral 

oc 2 cc 

(1) (a) fix) = A(w) cos wxdwy where (b) A(w) = — /(u) cos wu du 

•'o tt ■'0 

[see (10), (11), Sec. 11.7]. We now set A(yv) = V2/ir / c (w), where c suggests “cosine.” 
Then from (lb), writing v = x, we have 


( 2 ) 



/ c (vv) = — I / (v) COS W.Y dx 

77 Jq 


and from (la), 

(3) 



/(*) = , / — I /c(w) cos wx dw. 

7 T J 0 


ATTENTION! In (2) we integrate with respect to x and in (3) with respect to vi\ Formula 
(2) gives from f(x) a new function f c (w), called the Fourier cosine transform of f(x). 
Formula (3) gives us back f(x) from / c (w), and we therefore call f(x) the inverse Fourier 
cosine transform of f c (w). 

The process of obtaining the transform f c from a given f is also called the Fourier 
cosine transform or the Fourier cosine transfoim method. 


Fourier Sine Transform 

Similarly, for an odd function f(x\ the Fourier integral is the Fourier sine integral [see 
(12), (13), Sec. 11.7] 


(4) (a) f{x) - B(w) sin wx dw , where 

J o 

We now set Z?(vi>) = V2/7T f s (w) 9 where s suggests 
we have 


(b) B(w) = — f(v) sin wv dv. 

77 J o 

“sine.” Then from (4b), writing v = x , 


(5) 



f s i w ) = /— I /(*) sin w* dx. 


77 J Q 


This is called the Fourier sine transform of f(x). Similarly, from (4a) we have 


( 6 ) 



fix) = / — f s iw) sin via* dw. 

77 J Q 


This is called the inverse Fourier sine transform of f s (w). The process of obtaining f s (w) 
from f(x) is also called the Fourier sine transform or the Fourier sine transform method. 
Other notations are 


®c(f) = U = fs 


and S’c 1 and 1 for the inverses of and cF s , respectively. 
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EXAMPLE 1 


k 


x = a x 
Fig. 282. f(x) in 
Example 1 


EXAMPLE 2 


Fourier Cosine and Fourier Sine Transforms 

Find the Fourier cosine and Fourier sine transforms of the function 

[k if 0 < x < a 

fix) = { (Fig. 282). 

|0 if x > a 

Solution. From the definitions ( 2 ) and (5) we obtain by integration 

Uw) = J^ k l cos wxdx - Jl 1 k (^r) 

f s M = *> dx = k (- 7 ") • 

This agrees with formulas 1 in the first two tables in Sec. 11.10 (where k = 1). 

Note that for fix) = k = const (0 < x < »). these transforms do not exist. (Why?) ■ 

Fourier Cosine Transform of the Exponential Function 

Find %(e~ x ). 

Solution. By integration by parts and recursion, 

\y C _ nr e~ x r 

< & c (e~ x ) - / — I e x cos wx dx = — 5 - (-cos wx 4* w sin »wc) 

\f tt J 0 V w 1 + w I 0 

This agrees with formula 3 in Table I T Sec. 11.10, with a = 1 . See also the next example. B 

What did we do to introduce the two integral transforms under consideration? Actually 
not much: We changed the notations A and B to get a “symmetric” distribution of the 
constant 2 /tt in the original formulas (10)— (13), Sec. 11.7. This redistribution is a standard 
convenience, but it is not essential. One could do without it. 

What have we gained? We show next that these transforms have operational properties 
that permit them to convert differentiations into algebraic operations (just as the Laplace 
transform does). This is the key to their application in solving differential equations. 


\Z2hr 
1 + w 2 


Linearity, Transforms of Derivatives 

If fix) is absolutely integrable (see Sec. 11.7) on the positive A-axis and piecewise 
continuous (see Sec. 6.1) on every finite interval, then the Fourier cosine and sine 
transforms of f exist. 

Furthermore, if f and g have Fourier cosine and sine transforms, so does af + bg for 
any constants a and b> and by (2), 



W + bg) = / — I [a fix) + bg(x)] cos wx dx 

TT J Q 



— a / — I fix) cos wx dx + b 

7T Jq 


/ir 

y 7T J o 


g{x) cos wx dx. 


The right side is a& c (f) 4* b& c (g). Similarly for by (5). This shows that the Fourier 
cosine and sine transforms are linear operations, 


( 7 ) 


(a) 9Jflf + bg) = a® c (f) + b& c (g), 

(b) ® s (af + bg) = a&Jif) + b® s (g). 
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THEOREM 1 


PROOF 


Cosine and Sine Transforms of Derivatives 

Let fix) be continuous and absolutely integrable on the x-axis, let fix) be piecewise 
continuous on every finite interval, and let let fix) — » 0 as x -* Then 


( 8 ) 


(a) 9 c [f'ix)) = wS? s {/Cc)} - m, 

(b) ®Af(x)) = -w& e {f(x)). 


This follows from the definitions by integration by parts, namely, 
® c [f(x)} = J— [ f'ix) cos wxdx 

y 7T J o 


nr r 

= / — fix) cos wx + w I fix) sin wx dx 
V tr L o J o 


= - J-m + w® s {f(x))> 

7T 


and similarly, 


■M 


ix) sin wx dx 


fix) sin wx 

= 0 - w3? c {/Wl. 


— w J fix) cos wx dxj 


Formula (8a) with /' instead of f gives (when f satisfy the respective assumptions 
for /, f in Theorem 1) 


SF c {/"(*)} = w® s {f(x)} 


hence by (8b) 
(9a) 

Similarly, 

(9b) 


= -vw 2 9? c {/W} 


- 


W"(*)} = -w 2 SF s {/(*)} + J- Wfi0). 


A basic application of (9) to PDEs will be given in Sec. 12.6. For the time being we 
show how (9) can be used for deriving transforms. 
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EXAMPLE 3 An Application of the Operational Formula (9) 

Find the Fourier cosine transform & c (e~ ax ) of /(*) = where a > 0. 
Solution . By differentiation, {e~ ax ) n — c?‘e~ (iX \ thus 

a 2 f(x) = f\x). 

From this, (9a), and the linearity (7a), 

a z 9 c (f) = 9jf) 


Hence 


= ~w 2 ® c (f) - - f 

y 77 

(a 2 + w 2 )9 c (f) = aVIhr. 


( 0 ) 


The answer is (see Table I, Sec. 1 1.10) 


*<<'"">- ^{777) 


Tables of Fourier cosine and sine transforms are included in Sec. 11.10. 





|l-IO| FOURIER COSINE TRANSFORM 

1. Let f(x ) = —1 if 0 < x < 1, f(x) = lifl<*<2, 
f{x) = 0 if * > 2. Find f c (w). 

2. Let fix) = x if 0 < x < k> fix) - 0 if x > k. Find 

f c M- 

3. Derive formula 3 in Table 1 of Sec. 1 1. 10 by integration. 

4. Find the inverse Fourier cosine transform fix) from the 
answer to Prob. 1. Hint Use Prob. 4 in Sec. 11.7. 

5. Obtain SFj *(1/(1 + w 2 )) from Prob. 3 in Sec. 11.7. 

6. Obtain ^\e~ w ) by integration. 

7. Find 8F C ((1 — x 2 ) -1 cos (ttjc/2)). Hint Use Prob. 5 in 
Sec. 11.7. 

8. Let fix) = x 2 if 0 < x < 3 and 0 if x > 1. Find & c if). 

9. Does the Fourier cosine transform of x" 1 sin* exist? 
Of jc" 1 cos jc? Give reasons. 

10. fix) = 1 (0 < * < 00) has no Fourier cosine or sine 
transform. Give reasons. 

FOURIER SINE TRANSFORM 

11. Find SF s 0 _77:c ) by integration. 


12. Find the answer to Prob. 11 from (9b). 

13. Obtain formula 8 in Table II of Sec. 11.11 from (8b) 
and a suitable formula in Table I. 

14. Let fix) = sin*if0<*<77 and 0 if * > 77. Find 
SF s (/). Compare with Prob. 6 in Sec. 11.7. Comment. 

15. In Table II of Sec. 11.10 obtain formula 2 from formula 
4, using Til) ~ ^77 [(30) in App. 3.1]. 

16. Show that ^ix~ m ) = w" 1/2 by setting wx = t 2 and 
using S(°°) = V77/8 in (38) of App. 3.1. 

17. Obtain ^ s (^ _ctx ) from (8a) and formula 3 in Table I of 
Sec. 11.10. 

18. Show that ® s ix~ m ) = 2w m . Hint Set wx = f 2 , 
integrate by parts, and use C(°°) = Vn? 8 in (38) of 
App. 3.1. 

19. (Scale change) Using the notation of (5), show that 
fiax) has the Fourier sine transform (1 /a)f s iw/a). 

20. WRITING PROJECT. Obtaining Fourier Cosine 
and Sine Transforms. Write a short report on ways 
of obtaining these transforms, giving illustrations with 
examples of your own. 
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11.9 Fourier Transform. 

Discrete and Fast Fourier Transforms 

The two transforms in the last section are real. We now consider a third one, called the 
Fourier transform, which is complex. We shall obtain this transform from the complex 
Fourier integral, which we explain first. 


Complex Form of the Fourier Integral 

The (real) Fourier integral is [see (4), (5), Sec. 11.7] 


where 


f(x) = I [A(w) cos wx + B(w) sin wx] dw 
J o 

1 r 00 1 r 00 

A(w) = — f(v) cos wv dv. B(w) = — f(v) sin dv. 

77 J -x 77 


77 J -oo 


Substituting A and B into the integral for /, we have 


f(x) = — f f f(v) Lcos wv cos wx + sin wv sin wx] dv dw. 

77 J 0 — cc 


By the addition formula for the cosine [(6) in App. A3.1] the expression in the brackets 
[• • •] equals cos (wu — wx) or, since the cosine is even, cos (wx — wu). We thus obtain 

(1*) fix) = ~ /fa) cos ( wx ~ wv ) dv\^ dw. 

The integral in brackets is an even function of w, call it F(w), because cos (wx — vvu) is 
an even function of w, the function / does not depend on vi>, and we integrate with respect 
to v (not w). Hence the integral of F(w) from w = 0 to «> is 1/2 times the integral of F(w) 
from — <» to oo. Thus (note the change of the integration limit!) 

(1) f(x) = ^ ^ v ) cos ( wx “ dv J dw. 


We claim that the integral of the form (1) with sin instead of cos is zero: 

(2) — ^ j £ J f(v) sin (wx — vvu) dv J dw = 0. 

This is true since sin (wx — wv) is an odd function of w, which makes the integral in 
brackets an odd function of w, call it G(vr). Hence the integral of G(w) from — oo to oo }$ 
zero, as claimed. 

We now take the integrand of (1) plus i (= V^T) times the integrand of (2) and use 
the Euler formula [(1 1) in Sec. 2.2] 


0) 


e w = cos x + i sin a*. 
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Taking wx — wu instead of x in (3) and multiplying by f(v) gives 

f(v) cos (wx - wv) 4- if(v) sin (wx — wv) = f(v)e Uwx ~ wv \ 

Hence the result of adding (1) plus i times (2). called the complex Fourier integral, is 

J oc oo 

(4) f(x) = — I f f(v)e iwix - u) dv dw (i = V^T). 

27T ■'-oo ■'-oo 

It is now only a very short step to our present goal, the Fourier transform. 

Fourier Transform and Its Inverse 

Writing the exponential function in (4) as a product of exponential functions, we have 

<5> m - vfe £ [ vb *] »■ 

The expression in brackets is a function of w , is denoted by /(w), and is called the Fourier 
transform of f; writing v = x, we have 

(6) f(w) = -j= f f(x)e~ iw *dx. 

V2tt ■'-oc 

With this, (5) becomes 

(7) fix) = -}= [ fMe iwx dw 

v2tt J -c c 

and is called the inverse Fourier transform of f(w). 

Another notation for the Fourier transform is 

f = m 

so that 

f = 

Tlie process of obtaining the Fourier transform 3^(f) = / from a given f is also called 
the Fourier transform or the Fourier transform method. 

Conditions sufficient for the existence of the Fourier transform (involving concepts 
defined in Secs. 6.1 and 1 1.7) are as follows, as we state without proof. 


Existence of the Fourier Transform 

Iff(x) is absolutely integrable on the x-axis and piecewise continuous on evety finite 
interval then the Fourier transform f(w) of f(x) given by (6) exists . 
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EXAMPLE 1 Fourier Transform 

Find the Fourier transform of f(x) = 1 if |jc| < 1 and /( x) = 0 otherwise. 
Solution . Using (6) and integrating, we obtain 


m 


= _J_ [' e -iu 
L* 


: dx = 


1 


—iwx 


1 


(e’™ _ € w> )m 


V27 t ~iw I _i -iwV2^ 

As in (3) we have e tw = cos w + / sin w, e~ lw = cos w — i sin w, and by subtraction 

e iw - e~ ixu = 2 i sin iv. 

Substituting this in the previous formula on the right, we see that i drops out and we obtain the answer 


,77 sin w 
— 


EXAMPLE 2 Fourier Transform 

Find the Fourier transform ^(e~ ax ) of fix) = e~ ax if x > 0 and f(x) = 0 if x < 0; here a > 0. 
Solution . From the definition (6) we obtain by integration 

00 

ne-^) = J e-^e-^dx 

j e -ia+iw)x 00 j 

V2 7T “(e + *w) r=0 V27r (a + iw) 


This proves formula 5 of Table III in Sec. 1 1.10. 


Physical Interpretation: Spectrum 

The nature of the representation (7) of f(x ) becomes clear if we think of it as a superposition 
of sinusoidal oscillations of all possible frequencies, called a spectral representation. 
This name is suggested by optics, where light is such a superposition of colors 
(frequencies). In (7), the “spectral density” f(w) measures the intensity of f(x) in the 
frequency interval between w and w 4- Aw (Aw small, fixed). We claim that in connection 
with vibrations, the integral 

J l/(w)| 2 dw 

-CC 

can be interpreted as the total energy of the physical system. Hence an integral of |/(w)| 2 
from a to b gives the contribution of the frequencies w between a and b to the total energy. 

To make this plausible, we begin with a mechanical system giving a single frequency, 
namely, the harmonic oscillator (mass on a spring, Sec. 2.4) 


my" + ky = 0. 


Here we denote time t by x. Multiplication by / gives my y" + ky y = 0. By integration, 

\mv 2 + \ky 2 = E 0 = const 

where v — y is the velocity. The first term is the kinetic energy, the second the potential 
energy, and £ 0 the total energy of the system. Now a general solution is (use (3) in 
Sec. 1 1 .4 with t = x) 
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THEOREM 2 


PROOF 


y = a x cos w 0 x + b 1 sin w 0 x = w 0 2 = kfm 


where c ± = (a x — ib x )l 2, c_ x = c x = (% + ib x )/ 2. We write simply >4 = 

£ = c_ 1 *~ 224, ° x . Then y = A + B. By differentiation, v = y f = A* + B f = iw 0 (A — B). 
Substitution of u and y on the left side of the equation for E 0 gives 

E 0 = \mv 2 + \ky 2 = %m(iw 0 ) 2 (A — B) 2 + § k(A + B) 2 . 

Here w 0 2 = k/m , as just stated; hence mw 2 = &. Also i 2 = —1, so that 

E 0 = \k[-(A ~ Bf + (A + £) 2 ] = 2£A5 = = 2*|c 1 | 2 . 

Hence the energy is proportional to the square of the amplitude |cj. 

As the next step, if a more complicated system leads to a periodic solution y = f(x) 
that can be represented by a Fourier series, then instead of the single energy term |c x | 2 we 
get a series of squares |c n | 2 of Fourier coefficients c n given by (6), Sec. 1 1.4. In this case 
we have a “discrete spectrum” (or “point spectrum”) consisting of countably many 
isolated frequencies (infinitely many, in general), the corresponding |c n | 2 being the 
contributions to the total energy. 

Finally, a system whose solution can be represented by an integral (7) leads to the above 
integral for the energy, as is plausible from the cases just discussed. 


Linearity. Fourier Transform of Derivatives 

New transforms can be obtained from given ones by 


Linearity of the Fourier Transform 

The Fourier transform is a linear operation; that is, for any functions f(x) and g(x) 
whose Fourier transforms exist and any constants a and b, the Fourier transform 
of a f + bg exists , and 

(8) 9(af + bg) = a9(f) + b9(g). 


This is true because integration is a linear operation, so that (6) gives 

1 r°° 

9{af(x) + bg(x)} = ^j= J Jaf(x) + bg(x)]e~ tw * dx 

= ° vfe + b vfe 

= a9{f(x)} + b9[g(x)}. M 


In applying the Fourier transform to differential equations, the key property is that 
differentiation of functions corresponds to multiplication of transforms by iw: 
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THEOREM 3 


PROOF 


EXAMPLE 3 


Fourier Transform of the Derivative of f[x) 

Let f(x) be continuous on the x-axis and f(x) 0 as |jc| — * <». Furthermore , let 
f'(x) be absolutely integrable on the x-axis. Then 

(9) *{/'(*)} = iw9[fi x)}. 


From the definition of the Fourier transform we have 

= ~= j j'(x)e~ iwx dx. 

Integrating by parts, we obtain 

*{/'(*)} = [f(x)e- iw * 

Since f(x) — » 0 as \x\ the desired result follows, namely, 

®{f(x)} = 0 + iw®(f(x)). ■ 

Two successive applications of (9) give 

9(f) = iw9(f') = (Mf9(f). 

Since (iw) 2 = — w 2 , we have for the transform of the second derivative of / 

(10) nf(x)} = -w 2 9{f(x)l 


— (— iw ) f /C*)e twx dx 


Similarly for higher derivatives. 

An application of (10) to differential equations will be given in Sec. 12.6. For the time 
being we show how (9) can be used to derive transforms. 


Application of the Operational Formula (9) 

Find the Fourier transform of xe from Table III, Sec 11.10. 
Solution . We use (9). By formula 9 in Table III. 
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THEOREM 4 


PROOF 


Convolution 

The convolution f * g of functions f and g is defined by 

oo oc 

(11) h(x) = (f* g) Or) = [ f{p)g(x - p) dp = f f(x - p)gip) dp. 

OO OC 


The purpose is the same as in the case of Laplace transforms (Sec. 6.5): taking the 
convolution of two functions and then taking the transform of the convolution is the same 
as multiplying the transforms of these functions (and multiplying them by V2tt): 


Convolution Theorem 

Suppose that f(x) and g(x) are piecewise continuous , bounded , and absolutely 
integrable on the x-axis . Then 

(i2) ®if * g) = Vi^vismg). 


By the definition, 


1 oo OO 

m * g) = f_ x f_J(p)g(x - P) dp e-™* dx. 

An interchange of the order of integration gives 

j oc oc 

®if *g)~ —7^= f j f(p)g(x ~ P)e~ iwx dx dp. 

V27T J -oc J -zc 

Instead of x we now take x — p = q as a new variable of integration. Then x = p + q and 

| 00 00 

®(f * g) = f x j_J(p)g(q)e~ iw<p+q) dq dp. 

This double integral can be written as a product of two integrals and gives the desired 

result 

9(f * g) = t 4= J fip)e- iwp dp J g(q)e~ iwq dq 


1 

V2tt 


[V^(/)][V^3?(g)] = V2 n9(f)9(g). 


By taking the inverse Fourier transform on both sides of (12), writing / = 3F(/) and 
g — S'(g) as before, and noting that V27 r and 1/V27T in (12) and (7) cancel each other, 
we obtain 


(13) 


r - 

(/ * g)(x) = J f(w)g(w)e twx dw ; 


a formula that will help us in solving partial differential equations (Sec. 12.6). 
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Discrete Fourier Transform (DFT), 

Fast Fourier Transform (FFT) 

In using Fourier series, Fourier transforms, and trigonometric approximations (Sec. 11.6) 
we have to assume that a function /( x), to be developed or transformed, is given on some 
interval, over which we integrate in the Euler formulas, etc. Now very often a function 
/( x) is given only in terms of values at finitely many points, and one is interested in 
extending Fourier analysis to this case. The main application of such a “discrete Fourier 
analysis” concerns large amounts of equally spaced data, as they occur in 
telecommunication, time series analysis, and various simulation problems. In these 
situations, dealing with sampled values rather than with functions, we can replace the 
Fourier transform by the so-called discrete Fourier transform (DFT) as follows. 

Let f(x) be periodic, for simplicity of period 2tt. We assume that N measurements of 
f(x) are taken over the interval 0 ^ x ^ 277 at regularly spaced points 

2 t rk 

(14) Xk = k = 0, 1, 1. 


We also say that f(x) is being sampled at these points. We now want to determine a 

complex trigonometric polynomial 

(15) q(x) = 2 c n e inXk 

71 = 0 


that interpolates f(x ) at the nodes (14), that is, q( x k ) = f(x k ), written out, with f k denoting 
f(x k ). 


(16) 


N—l 

fk = f(x k ) = q(x k ) = 2 c n e inXk , 

n - 0 


A: = 0, 1, •••, N- 1. 


Hence we must determine the coefficients c 0 , • • • , c N ^i such that (16) holds. We do this 
by an idea similar to that in Sec. 11.1 for deriving the Fourier coefficients by using the 
orthogonality of the trigonometric system. Instead of integrals we now take sums. Namely, 
we multiply (16) by e~ tmXk (note the minus!) and sum over k from 0 to N — 1. Then we 
interchange the order of the two summations and insert x k from (14). This gives 


(17) 


N-l N—l N—l 

2 = 22 


N-l N-l 

V Hn—m)2rrkfN 

2u c n 2u e 


k=0 


k- 0 n - 0 


n=0 k=0 


Now 


^i(n-m)27rfc/N _ r^i(n— m)27j/Nl fc 


We donote [• • •] by r. For n = m we have r = e° = 1. The sum of these terms over k 
equals N> the number of these terms. For n ± m we have 1 and by the formula for a 
geometric sum [(6) in Sec. 15.1 with q = r and n = N - 1] 


N-l 

2 r k = 


k=Q 


1 - 
1 - r 


= 0 
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because r N = 1; indeed, since fc, m, and n are integers, 

r N = ^ (n “ w)2wfc = cos 27r*(n — m) + i sin27ri k(n — m) = 1 + 0=1. 

This shows that the right side of (17) equals c m N. Writing n for m and dividing by JV, we 
thus obtain the desired coefficient formula 

l 

(18*) c n = - X /**-*“* / fc = /(**), n = 0, 1, • • • , N - 1. 

™ k—Q 


Since computation of the c n (by the fast Fourier transform, below) involves successive 
halfing of the problem size N> it is practical to drop the factor 1 IN from c n and define the 
discrete Fourier transform of the given signal f = [/ 0 ■ • • /n-i] t t0 be the vector 
f = [f 0 • • • /iv— i] with components 

N-l 

(18) fn = Nc n = '2 f k e~ in *\ f k = f(x k ), n = 0, • • • , N - 1. 

fc =0 


This is the frequency spectrum of the signal. 

In vector notation, f = F N f, where the N X N Fourier matrix Fn - \ e n fc ] has the 
entries [given in (18)] 

(19) e nk = e~ inx * = e ~ MnklN = w 7 *, w = w N = e" 2 ^, 

where n, k = 0, • • • , N — 1. 


EXAMPLE 4 


Discrete Fourier Transform (DFT). Sample of N = 4 Values 


Let N = 4 measurements (sample values) be given. Then w = c 2iri/JV = e~" xl2 - —i and thus w nk = (-i) nk . 
Let the sample values be, say f = [0 1 4 9] T . Then by (18) and (19), 


(20) f = F 4 f = 


— 1 

3 

o 

w ° 


H-°' 


v , 1 

,v 2 

w® 

w° 

w 2 


CO 

a 

1 

o 

u- 3 

,v 6 



f = 


-1 



’o" 


14 


1 


-4 + 8/ 


4 


-6 


.9. 


.-4 - 8 i m 


From the first matrix in (20) it is easy to infer what Fjy looks like for arbitrary N, which in practice may be 
1000 or more, for reasons given below. I 


A 

From the DFT (the frequency spectrum) f — F jV f we can recreate the given signal 
f = Fn 1 f, as we shall now prove. Here F N and its complex conjugate F N = — [iv"*] satisfy 

(21a) F n F n = F n F n = Nl 


where I is the N X N unit matrix; hence F N has the inverse 



(21b) 
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We prove (21). By the multiplication rule (row 
G N = F n F n = [g jk \ in (21a) has the entries g jk = 

That is, writing W = w j w k , we prove that 

§jk = (w^w k )° 4- (wW + • • • + 

= W° + W 1 + • • • 4- W iV " 1 = 

Indeed, when j = then w k w k = (ww) k = ( e 2 ^ N e - 2 ^ N ) k = \ k = i, so that the sum 
of these N terms equals N; these are the diagonal entries of G N . Also, when j & k , then 
W =£ 1 and we have a geometric sum (whose value is given by (6) in Sec. 15.1 with 
q—W and n — N — 1 ) 

1 - W N 

W° + W 1 - f * * * + W N “ 1 = = 0 

1 — w 

because W N = (w j w k ) N = {e 27li y{e~ 27ri ) k = i j • \ k = 1. ■ 


times column) the product matrix 
Row j of ¥ n times Column k of¥ N . 

(w j w k ) N ~ x 
f 0 if j±k 

In if j = k. 


We have seen that f is the frequency spectrum of the signal f(x). Thus the components 
f n of f give a resolution of the 27r-periodic function f(x) into simple (complex) harmonics. 
Here one should use only n’s that ai’e much smaller than N/ 2, to avoid aliasing. By this we 
mean the effect caused by sampling at too few (equally spaced) points, so that, for instance, 
in a motion picture, rotating wheels appear as rotating too slowly or even in the wrong sense. 
Hence in applications, N is usually large. But this poses a problem. Eq. (18) requires 0(N) 
operations for any particular n, hence 0(N 2 ) operations for. say. all n < N/2 . Thus, already 
for 1000 sample points the straightforward calculation would involve millions of operations. 
However, this difficulty can be overcome by the so called fast Fourier transform (FFT), 
for which codes are readily available (e.g. in Maple). The FFT is a computational method 
for the DFT that needs only O(N) log 2 N operations instead of 0(N 2 ). It makes the DFT a 
practical tool for large N . Here one chooses N = 2 P (p integer) and uses the special form 
of the Fourier matrix to break down the given problem into smaller problems. For instance, 
when N = 1000, those operations are reduced by a factor 1000/log 2 1000 100. 

The breakdown produces two problems of size M = N/2. This breakdown is possible 
because for iV = 2M we have in (19) 

w N 2 = w 2M 2 = {e - 2 ™ tN ) 2 = e- 4 ^ /(2M > = e-*** = w M . 

The given vector f = [/ 0 • * ■ /jv-i] T * s split into two vectors with M components 

each, namely, f ev = [/ 0 / 2 ’ ‘ ' /jv-2] T containing the even components of f, and 

^CKl = [/i fs ■ • ■ /n-i] t containing the odd components of f. For f ev and f od we 
determine the DFTs 

a a 

/ev,2 * * * / ev.N— 2J = Fj^fey 
/ od,3 * * * /od.iV-l] = F M f od 

involving the same M X M matrix F M . From these vectors we obtain the components of 
the DFT of the given vector f by the formulas 


and 


fev = [/ 


ev,0 


fod [/od,l 


( 22 ) 


fn f ev.n "F / od.n 

(b) 

f n+M f ev,« f od.n 


n . = 0, • • • , M - 1 
n = 0, • • • , M - I. 
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EXAMPLE 5 


For N = 2 P this breakdown can be repeated p — 1 times in order to finally arrive at M2 
problems of size 2 each, so that the number of multiplications is reduced as indicated 
above. 

We show the reduction from N = 4 to M = Nil = 2 and then prove (22). 

Fast Fourier Transform (FFT). Sample of N — 4 Values 

When N — 4, then w — w N = — i as in Example 4 and M = N/2 = 2, hence w = = e” 2,7 ^ 2 — e~ 7rl = — 1. 

Consequently, 



From this and (22a) we obtain 

fo - /ev.o + »’/V 0 /od.O = Cfo + fz) + (/l + /3> = /o + /l + fz + /s 
fl = few 1 + u ’/V 1 /od,l = (/o ~ /s) ~ «/i + /s) “ /o “ Vl ~ /2 + Vs* 

Similarly, by (22b), 

/2 = /ev.O — »r JV 0 /od,0 = tfo + / 2 ) • Cfl + /s) = fo ~ fl + Jz ~ /3 

/3 = /ev.i — w JV 1 /od.i = Vo — /2) ” (“OCfi ~ /3) = /o + if i — f% ~ tfz- 
This agrees with Example 4, as can be seen by replacing 0, 1 , 4. 9 with / 0 . / j, / 2 , / 3 . H 


We prove (22). From (18) and (19) we have for the components of the DFT 

fn = 2 w N U fk • 

Splitting into two sums of M = M2 terms each gives 

Jtf-l iW-1 

fn = 2 w N kn f2k + 2 ^N 2/C+1)n /2fc-f*l* 

fc =0 fc =0 

We now use w N 2 = w M and pull out w N n from under the second sum, obtaining 

M— 1 M-l 

(23) f n = 2 w M l fev,k + W N U 2 w Af W /od,fc* 

/c=0 fe=0 

The two sums are / ev n and / od n , the components of the “half-size’' transforms F f ev and 
F Cod- 

Formula (22a) is the same as (23). In (22b) we have n + M instead of n. This causes 
a sign change in (23), namely — before the second sum because 

w n m = e~ 27riM/N = e - 27ri/2 = e~ wi = - 1 . 


This gives the minus in (22b) and completes the proof. 
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Iff 



1. (Review) Show that 1// = — e ix + e ix = 2 cos a, 
e lx — e~™ = 2 i sin a:. 


2-9 


FOURIER TRANSFORMS BY INTEGRATION 


Find the Fourier transform of /(a) (without using Table III 
in Sec. 11.10). Show the details. 


8 * /(*) = 


xe x 
. 0 


if - 1 < x <0 
otherwise 


9. fix ) = 



if -1 < a* < 0 
if 0 < a < 1 


2. /(a) 


if a < 0 (it > 0) 
0 if a > 0 


3. fix) 


r k if 0 < a < b 
.0 otherwise 


4. fix) 


e 2tx if — 1 < a < 1 
0 otherwise 


5- fix) 


& if - 1 < a < 1 
L 0 otherwise 


6- fix) 


x if -1 < a < I 
,0 otherwise 


7. fix) 


A if 0 < A < 1 

.0 otherwise 


l 0 otherwise 

OTHER METHODS 

10. Find the Fourier transform of fix) = xe~ x if x > 0 and 
0 if a < 0 from formula 5 in Table III and (9) in the 
text. Hint: Consider xe~ x and e~ x . 

11. Obtain tfie"** 12 ) from formula 9 in Table ID. 

12. Obtain formula 7 in Table III from formula 8. 

13. Obtain formula 1 in Table III from formula 2. 

14. TEAM PROJECT. Shifting, (a) Show that if fix) 
has a Fourier transform, so does fix — a ), and 
nfi a - a)} = e~* wa cF { f (a) } . 

(b) Using (a), obtain formula 1 in Table III, Sec. 11.10, 
from formula 2. 

(c) Shifting on the w-Axis. Show that if f(w) is the 
Fourier transform of /(a), then /(w — a) is the Fourier 
transform of e tax f{x). 

(d) Using (c), obtain formula 7 in Table ITT from 1 and 
formula 8 from 2. 
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11.1C Tables of Transforms 

Table I. Fourier Cosine Transforms 


See (2) in Sec. 1 1 .8. 
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Fourier Sine Transforms 

See (5) in Sec. 11.8. 
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Table I Fourier Transforms 


See (6) in Sec. 11.9. 
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,VflMW»1 


TIONS AND PROBLEMS 


1. What is a Fourier series? A Fourier sine series? A 
half-range expansion? 

2. Can a discontinuous function have a Fourier series? A 
Taylor series? Explain. 

3. Why did we start with period 2 tt? How did we proceed 
to functions of any period pi 

4. What is the trigonometric system? Its main property by 
which we obtained the Euler formulas? 

5. What do you know about the convergence of a Fourier 
series? 

6. What is the Gibbs phenomenon? 

7. What is approximation by trigonometric polynomials? 
The minimum square error? 

8. What is remarkable about the response of a vibrating 
system to an arbitrary periodic force? 

9. What do you know about the Fourier integral? Its 
applications? 

10. What is the Fourier sine transform? Give examples. 


11-20 


FOURIER SERIES 


Find the Fourier series of fix) as given over one period. 
Sketch fix). (Show the details of your work.) 


n. /to = 


12. f(x) = 



if - 1 < .v < 0 
if 0 < x < 1 

if — 7 t/2 < x < tt/2 
if 7 t/2 < a < 3 77/2 


13. /(a) = x (—2 7T < x < 27 r) 

14. f(x) = |*| (-2 < a < 2) 


15. f(x) = 


16. fix) = 



if - 1 < * < 1 
if I < a < 3 

if -1 < a < 0 
if 0 < a < 1 


17. fix) = |sin 8 7ta| (—1/8 < a < 1/8) 

18. fix) = 6 X ( ~ 7T < a < 7T> 

19. fix) = A 2 (-77/2 < A < 77/2) 

20. fix) = A (0 < A < 27 t) 


21-23 


Using the answers to suitable odd-numbered 


problems, find the sum of 


21. 1 3 + 5 


7 + - 


1 1 l 

224 1*3 + 3-5 + 5-7 


23. 1 "F 9 +* 25 “f 


24. (Parseval’s identity) Obtain the result of Prob. 23 by 
applying Parseval’s identity to Prob. 12. 

25. What are the sum of the cosine terms and the sum of 
the sine terms in a Fourier series whose sum is /(*)? 
Give two examples. 

26. (Half-range expansion) Find the half-range sine series 
of fix) = 0 if 0 < a < tt/2, fix) = 1 if 7r/2 < a < 7 r. 
Compare with Prob. 12. 

27. (Half-range cosine series) Find the half-range cosine 
series of fix) = a (0 < a < 27 r). Compare with 
Prob. 20. 


28-29 1 MINIMUM SQUARE ERROR 

Compute the minimum square errors for the trigonometric 
polynomials of degree N = 1 , • • • , 8: 

28. For fix) in Prob. 12. 

29. For fix) = a (— 77 < a < 77 ). 


30-31 


GENERAL SOLUTION 


Solve y" + <o 2 y = rit ), where \a)\ 0, I, 2, * • • , /*(/) 

is 277-periodic and: 

30. ;*(/) = r(77 2 — t 2 ) ( — 77 < t < tt) 

31. rit) = r 2 ( — 77 < / < 77) 


32-37 


FOURIER INTEGRALS AND 
TRANSFORMS 


Sketch the given function and represent it as indicated. If 
you have a CAS, graph approximate curves obtained by 
replacing 0 ° with finite limits; also look for Gibbs 
phenomena. 


32. fix) = 1 if 1 < a < 2 and 0 otherwise, by a Fourier 
integral 


33. fix) = a if 0 < a < 1 and 0 otherwise, by a Fourier 
integral 
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37. f(x) = 4 — x 2 if — 2 < x < 2, /( x) — 0 otherwise, by 
a Fourier cosine integral 

38. Find the Fourier transform of /(a) = k if 
a < x < b, /( x) = 0 otherwise. 

39. Find the Fourier cosine transform of /( x) = e~ 2x if 
x > 0, fix) = 0 if x < 0. 

40. Find %P c (e~ 2x ) and $F s (e~ 2x ) by formulas involving 
second derivatives. 



34. f(x) = 1 + x/2 if -2 < jc < 0, fix) = I - a/ 2 if 
0 < x < 2, fix) — 0 otherwise, by a Fourier cosine 
integral 

35. fix) = -1 - xI2 if -2 < a* < 0, f(x) = 1 - a/2 if 
0 < a < 2, fix) = 0 otherwise, by a Fourier sine 
integral 

36. fix) = -4 + a 2 if -2 < a < 0, fix) = 4 - a 2 if 
0 < x < 2, fix) = 0 otherwise, by a Fourier sine 
integral 


Fourier Series, Integrals, Transforms 


Fourier series concern periodic functions fix) of period p = 2 L, that is, by definition 
fix + p) = fix) for all x and some fixed p > 0; thus, fix + np) = fix) for any 
integer n. These series are of the form 

^ ( mt mr \ 

(1) fix) = a 0 + 2 j \ a n cos ~r~ x + b n sin — jc] (Sec. 11.2) 

n=l \ L f 

with coefficients, called the Fourier coefficients of fix), given by the Euler formulas 
(Sec. 11.2) 


1 r L 1 r L 

~2 ZJ./ U)dX ’ 

, 1 r L nirx 

b n ~ T I /(*) sm ~7~ dx 


nirx 

cos — — ax 

Li 


where n = 1, 2, • • • . For period 2 77 * we simply have (Sec. 11.1) 


fix) = a 0 + 2 (fl?i cos ** + s * n ***) 


with the Fourier coefficients of fix) (Sec. 11.1) 

If 17 1 f 77 If 77 

= — J /(a) rfv, a n = — J fix) cos rtA<&, = — J fix) sin /?a d*. 

Z77 j — 7T 'TT J — tt 7T — tt 

Fourier series are fundamental in connection with periodic phenomena, 
particularly in models involving differential equations (Sec. 1 1.5, Chap. 12). If f(x) 
is even [/(—a) = /(a)] or odd [/(-a) = —fix)], they reduce to Fourier cosine or 
Fourier sine series, respectively (Sec. 1 1.3). If fix) is given for 0 ^ x ^ L only, 
it has two half-range expansions of period 2 L, namely, a cosine and a sine series 
(Sec. 11.3). 
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The set of cosine and sine functions in (1) is called the trigonometric system. 
Its most basic property is its orthogonality on an interval of length 2 L; that is, for 
all integers m and n ¥= m we have 


/ 


mmc time 
cos — - — cos — — dx = 0, 
-l L L 


r 


mmc nTrx 
sin — - — sin — — dx = 0 
-l L L 


and for all integers m and /z, 


f 


mmc nmc 
cos — - — sin — — dx = 0. 
-l L L 


This orthogonality was crucial in deriving the Euler formulas (2). 

Partial sums of Fourier series minimize the square error (Sec. 11.6). 

Ideas and techniques of Fourier series extend to nonperiodic functions f(x) defined 
on the entire real line; this leads to the Fourier integral 


(3) f(x) = f [A(w) cos wx + B(w) sin wx] dw (Sec. 11.7) 

J o 

where 

1 r 1 r x 

(4) A(w) = — I f(v ) cos wv du , B(w) = — I f(u) sin wv dv 

7 T J-cc. 7T 

or, in complex form (Sec. 11.9), 


(5) 

where 

( 6 ) 




dw 


a = v=i) 


f(w) = 


V2tt J - 


f f(x)e~ iwx dx. 

CC 


Formula (6) transforms /(x) into its Fourier transform f{w), and (5) is the inverse 
transform. 

Related to this are the Fourier cosine transform (Sec. 1 1 .8) 


(7) 


f c (w) = 



f(x) cos wx dx 


and the Fourier sine transform (Sec. 1 1.8) 


( 8 ) 


fsM = 



f(x) sin wx dx. 


The discrete Fourier transform (DFT) and a practical method of computing it, 
called the fast Fourier transform (FFT), are discussed in Sec. 1 1 .9. 





CHAPTER 1 2 

Partial Differential Equations 
(PDEs) 


PDEs are models of various physical and geometrical problems, arising when the unknown 
functions (the solutions) depend on two or more variables, usually on time t and one or 
several space variables. It is fair to say that only the simplest physical systems can be 
modeled by ODEs, whereas most problems in dynamics, elasticity, heat transfer, 
electromagnetic theory, and quantum mechanics require PDEs. Indeed, the range of 
applications of PDEs is enormous, compared to that of ODEs. 

In this chapter we concentrate on the most important PDEs of applied mathematics, the 
wave equations governing the vibrating string (Sec. 12.2) and the vibrating membrane 
(Sec. 12.7), the heat equation (Sec. 12.5), and the Laplace equation (Secs. 12.5, 12.10). 
We derive these PDEs from physics and consider methods for solving initial and 
boundary value problems, that is, methods of obtaining solutions satisfying conditions 
that are given by the physical situation. 

In Secs. 12.6 and 12.1 1 we show that PDEs can also be solved by Fourier and Laplace 
transform methods. 

COMMENT. Numerics for PDEs is explained in Secs. 21.4-21.7. 

Prerequisites: Linear ODEs (Chap. 2), Fourier series (Chap. 11) 

Sections that may be omitted in a shorter course: 12.6, 12.9-12.1 1 

References and Answers to Problems: App. I Part C, App. 2 


12.1 Basic Concepts 

A partial differential equation (PDE) is an equation involving one or more partial 
derivatives of an (unknown) function, call it m, that depends on two or more variables, 
often time / and one or several variables in space. The order of the highest derivative is 
called the order of the PDE. As for ODEs, second-order PDEs will be the most important 
ones in applications. 

Just as for ordinary differential equations (ODEs) we say that a PDE is linear if it is 
of the first degree in the unknown function u and its partial derivatives. Otherwise we call 
it nonlinear. Thus, all the equations in Example 1 on p. 536 are linear. We call a linear 
PDE homogeneous if each of its terms contains either u or one of its partial derivatives. 
Otherwise we call the equation nonhomogeneous. Thus, (4) in Example 1 (with f not 
identically zero) is nonhomogeneous, whereas the other equations are homogeneous. 
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EXAMPLE 1 


THEOREM 1 


Important Second-Order PDEs 


0) 

(2) 

( 3 ) 

( 4 ) 

( 5 ) 

(6) 


t) 2 u 

2 ^ 11 

m 2 

= C 2 o 

r)x 2 

f)U 

A 

<)t 

ii 
r ) 

*ti 

A 

a 2 ,, 

a** 

+ 2 = 

dy 2 


;> 2 u 

bx 2 

+ o = 

dy 2 


A A \ 

cXv 2 dy 2 j 


B 2 u 

A 

<) 2 u 

dx 2 

+ o + 

dy 2 

9z 2 


4 -*( 

fit 2 \ 


One-dimensional wave equation 
One-dimensional heat equation 
Two-dimensional Laplace equation 
Two-dimensional Poisson equation 
Two-dimensional wave equation 
Three-dimensional Laplace equation 


Here c is a positive constant, t is time, .v, v , z are Cartesian coordinates, and dimension is the number of these 
coordinates in the equation. M 


A solution of a PDE in some region R of the space of the independent variables is a 
function that has ail the partial derivatives appearing in the PDE in some domain D 
(definition in Sec. 9.6) containing R, and satisfies the PDE everywhere in R. 

Often one merely requires that the function is continuous on the boundary of R, has 
those derivatives in the interior of R, and satisfies the PDE in the interior of R. Letting 
R lie in D simplifies the situation regarding derivatives on the boundary of R, which is 
then the same on the boundary as it is in the interior of R. 

In general, the totality of solutions of a PDE is very large. For example, the functions 

(7) u = x 2 — y 2 , u = e x cos y, it = sin x coshy, u = In ( x 2 + y 2 ) 


which are entirely different from each other, are solutions of (3), as you may verify. We 
shall see later that the unique solution of a PDE corresponding to a given physical problem 
will be obtained by the use of additional conditions arising from the problem. For 
instance, this may be the condition that the solution u assume given values on the boundary 
of the region R (“boundary conditions”)* Or, when time t is one of the variables, u (or 
u t = du/dt or both) may be prescribed at t = 0 (“initial conditions”). 

We know that if an ODE is linear and homogeneous, then from known solutions we 
can obtain further solutions by superposition. For PDEs the situation is quite similar: 


Fundamental Theorem on Superposition 

If Ui and u 2 are solutions of a homogeneous linear PDE in some region R, then 

It = Cilli + C 2 «2 

with any constants c x and c 2 is also a solution of that PDE in the region R. 


The simple proof of this important theorem is quite similar to that of Theorem 1 in 
Sec. 2.1 and is left to the student. 
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Verification of solutions in Probs. 14-25 proceeds as for ODEs. Problems 1—12 concern 
PDEs solvable like ODEs. To help the student with them, we consider two typical 
examples. 

EXAMPLE 2 Solving u xx - u = 0 Like an ODE 

Find solutions u of the PDE u xx — it = 0 depending on x and y. 

Solution . Since no ^-derivatives occur, we can solve this PDE like u " - u = 0. In Sec. 2.2 we would have 
obtained u = Ae x -I- Be~ x with constant A and B. Here A and B may be functions of y, so that the answer is 

m(a\ y) = A(y)e x + B(y)e~ x 

with arbitrary functions A and B. We thus have a great variety of solutions. Check the result by differentiation. ■ 

EXAMPLE 3 Solving u xy = -u x Like an ODE 

Find solutions u = u(x, y) of this PDE. 

Solution . Setting u x = p, we have p y = —p. p y !p = - 1 , Inp = —y + c(.r). p = c(x)e~ y and by 
integration with respect to a, 

u(x, y) = f{x)c~ y + g(y) where /(.v) = J c(.v) dx; 
here. f(x) and g(v) are arbitrary. ■ 


~P]I n'P~I C KA C CT 1 7 

■ ■ — 

zr:K^Jrc:clVl a x 1 i 2 

^ 


1-12 PDEs SOLVABLE AS ODEs 


This happens if a PDE involves derivatives with respect to 
one variable only (or can be transformed to such a form), 
so that the other variable(s) can be treated as parameters ). 
Solve for it = u{x, y): 


1 . u yy 4 - 1 6u = 0 
3. Uyy ”” 0 
5. iiy 4 it = e xy 
7. ity = (coshA*)>w 
9. y 2 u vy + 2 ya y — 2u = 0 

11. u xy = u x 

12 . ityy + 10 u y 4 - 25 u = e~ 5y 


2 . u xx = u 
4. ity + 2 yu = 0 
6. ttjg* = 4y 2 « 

8 . 

10. Uyy — Axlly 


13. (Fundamental Theorem) Prove Fundamental 
Theorem 1 for second-order PDEs In two and three 
independent variables. 


14-25 


VERIFICATION OF SOLUTIONS 


Verify (by substitution) that the given function is a solution 
of the indicated PDE. Sketch or graph the solution as a 
surface in space. 


14-17 Wave Equation (1) with suitable c 

14. u = 4a* 2 + t 2 15. u = sin 8a cos 2 1 


16. u = sin 3a sin 18f 17. u — sin kx cos kct 


1 1 8-2 1 1 Heat Equation (2) with suitable c 
18. it — e~ zkt cos 8a 19. u = e -75-2 * sin 4a 

20. m = e~ Ao>it sin <ox 21. u = cos o>x 

1 22— 25 1 Laplace Equation (3) 

22. u in (7) in the text 23. u = cos 2 y sinh 2a 

24. m = arctan (y/x) 25. u = e x2 ~ y * sin 2a y 

26. TEAM PROJECT. Verification of Solutions 

(a) Wave equation. Verify that 

a{ a, t) = v{x + ct) + w ( x - ct) with any twice 
differentiable functions v and w satisfies (1). 

(b) Poisson equation. Verify that each u satisfies (4) 
with /(a, y) as indicated. 


4 i 4 

it = x 4- y 

f = 12(x z + y 2 ) 

u = cos a sin y 

f = —2 cos x sin y 

it = y/x 

f = 2 y/x 3 


(c) Lapl ace equation. Verify that 
u — 1/Va 2 + y 2 4- z 2 satisfies (6) and 
u = In (a 2 + y 2 ) satisfies (3). Is u = 1/Va 2 + y 2 a 
solution of (3)? Of what Poisson equation? 
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(d) Verity that u with any (sufficiently often 
differentiable) v and w satisfies the given PDE. 

u = v(x) 4- u*(y) t( xy = 0 

it = u(A)ir(y) ttu xy = u x it y 

u = u(a + 3r) + vi ? (.v — 3/) u tt ~ 9u xx 

27* (Boundary value problem) Verify that the function 
m(a\ y) = ci In (.v 2 4- y 2 ) 4- b satisfies Laplace’s 


equation (3) and determine a and b so that u satisfies 
the boundary conditions u = 110 on the circle 
a 2 4- y 2 = 1 and u = 0 on the circle a 2 + y 2 = 100. 


28-30 

Solve 


SYSTEMS OF PDEs 


28. u x = 0, u y = 0 

29. u xx = 0, u xy = 0 

30. It XX 0, Uyy 0 


12.1 Modeling: Vibrating String, Wave Equation 

As a first important PDE let us derive the equation modeling small transverse vibrations 
of an elastic string, such as a violin string. We place the string along the A-axis, stretch it 
to length L, and fasten it at the ends x = 0 and x = L. We then distort the string, and at 
some instant, call it t = 0, we release it and allow it to vibrate. The problem is to determine 
the vibrations of the string, that is, to find its deflection u( x, /) at any point a* and at any 
time t > 0; see Fig. 283. 

w( a\ /) will be the solution of a PDE that is the model of our physical system to be 
derived. This PDE should not be too complicated, so that we can solve it. Reasonable 
simplifying assumptions (just as for ODEs modeling vibrations in Chap. 2) are as 
follows. 

Physical Assumptions 

1. The mass of the string per unit length is constant (“homogeneous string”). The string 
is perfectly elastic and does not offer any resistance to bending. 

2. The tension caused by stretching the string before fastening it at the ends is so large 
that the action of the gravitational force on the string (trying to pull the string down 
a little) can be neglected. 

3. The string performs small transverse motions in a vertical plane; that is, every particle 
of the string moves strictly vertically and so that the deflection and the slope at every 
point of the string always remain small in absolute value. 

Under these assumptions we may expect solutions u{ x, t) that describe the physical 
reality sufficiently well. 




Fig. 283. Deflected string at fixed time t. Explanation on p. 539 
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Derivation of the PDE of the Model 
(“Wave Equation”) from Forces 

The model of the vibrating string will consist of a PDE (“wave equation”) and additional 
conditions. To obtain the PDE, we consider the forces acting on a small portion of the 
string (Fig. 283). This method is typical of modeling in mechanics and elsewhere. 

Since the string offers no resistance to bending, the tension is tangential to the curve 
of the string at each point. Let T x and T 2 be the tension at the endpoints P and Q of that 
portion. Since the points of the string move vertically, there is no motion in the horizontal 
direction. Hence the horizontal components of the tension must be constant. Using the 
notation shown in Fig. 283, we thus obtain 

(1) T 1 cos a = T 2 cos P = T = const. 


In the vertical direction we have two forces, namely, the vertical components —T x sin a 
and 7*2 sin P of T x and T 2 \ here the minus sign appears because the component at P is 
directed downward. By Newton’s second law the resultant of these two forces is equal 
to the mass p Ax of the portion times the acceleration d 2 uldt 2 , evaluated at some point 
between x and x + Ax; here p is the mass of the undeflected string per unit length, and 
Ax is the length of the portion of the undeflected string. (A is generally used to denote 
small quantities; this has nothing to do with the Laplacian V 2 . which is sometimes also 
denoted by A.) Hence 

d 2 u 

T 2 sin P - T x sin a = p Ax — ^ . 

dr 


Using (1), we can divide this by T 2 cos P = T x cos a = T, 


(2) 


T 2 sin ft 
T 2 cos p 


T x sin a 
T x cos a 


= tan P — tan a = 


Now tan a and tan p are the slopes of the string at x and x 


obtaining 

p Ax d 2 u 
T ~dt W ’ 

+ Ax: 


tan a = 



and 


tan p — 



Here we have to write partial derivatives because a depends also on time t. Dividing (2) 
by Ax, we thus have 


I 

Ax 


(— ) ' 

_ \ dx / ;i;+A.r \ cU* / x_ 


If we let Ax approach zero, we obtain the linear PDE 

(3) 


d 2 u _ 2 d 2 u 


c)i 


dx‘ 


2 » 


p d 2 U 

T ~dt 2 ’ 


c 2 = 


T 

9 ' 


This is called the one-dimensional wave equation. We see that it is homogeneous and 
of the second order. The physical constant Tip is denoted by c 2 (instead of c) to indicate 
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that this constant is positive, a fact that will be essential to the form of the solutions. 
“One-dimensional” means that the equation involves only one space variable, x. In the 
next section we shall complete setting up the model and then show how to solve it by a 
general method that is probably the most important one for PDEs in engineering 
mathematics. 


12.3 Solution by Separating Variables. 
Use of Fourier Series 


The model of a vibrating elastic 
one-dimensional wave equation 

( 1 ) 


string (a violin string, for instance) consists of the 


d 2 u 0 d 2 u 9 T 

—o = C 2 — o C = — 

dt 2 dx 2 p 


for the unknown deflection m(x, t) of the string, a PDE that we have just obtained, and 
some additional conditions, which we shall now derive. 

Since the string is fastened at the ends a- = 0 and x = L (see Sec. 12.2), we have the 

two boundary conditions 


(2) (a) m(0, t) = 0, (b) w(L, /) = 0 for all /. 


Furthermore, the form of the motion of the string will depend on its initial deflection 
(deflection at time t — 0), call it /(a), and on its initial velocity (velocity at t = 0), call 
it g( a). We thus have the two initial conditions 

(3) (a) «( a, 0) = /(a), (b) u t ( a, 0) = g( a) (0 ^ a ^ L) 


where u t = dulcit. We now have to find a solution of the PDE (1) satisfying the conditions 
(2) and (3). This will be the solution of our problem. We shall do this in three steps, as 
follows. 

Step 1. By the “method of separating variables” or product method , setting 
n(x, t) = F(a)G(/), we obtain from (1 ) two ODEs, one for F( a) and the other one for G(r). 

Step 2. We determine solutions of these ODEs that satisfy the boundary conditions (2). 

Step 3 . Finally, using Fourier series, we compose the solutions gained in Step 2 to obtain 
a solution of (1) satisfying both (2) and (3), that is, the solution of our model of the 
vibrating string. 


Step 1. Two ODEs from the Wave Equation (1) 

In the method of separating variables, or product method, we determine solutions of the 
wave equation ( 1 ) of the form 


(4) 


u(x, t) = F(x) G(t) 
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which are a product of two functions, each depending only on one of the variables x and 
t. This is a powerful general method that has various applications in engineering 
mathematics, as we shall see in this chapter. Differentiating (4), we obtain 



and 



where dots denote derivatives with respect to t and primes derivatives with respect to a. 
By inserting this into the wave equation (1) we have 

FG = c 2 F”G. 

Dividing by c 2 FG and simplifying gives 

G _ F" 

~c 2 G “ ~F ' 

The variables are now separated, the left side depending only on t and the right side only 
on x. Hence both sides must be constant because if they were variable, then changing 
t or x would affect only one side, leaving the other unaltered. Thus, say, 

G F" 

~c*G ~ ~F ~ k ' 

Multiplying by the denominators gives immediately two ordinary DEs 


(5) 

O 

II 

1 

and 


(6) 

G - c 2 kG = 0. 


Here, the separation constant k is still arbitrary'. 


Step 2. Satisfying the Boundary Conditions (2) 

We now determine solutions F and G of (5) and (6) so that u = FG satisfies the boundary 
conditions (2), that is, 

(7) «(( 0, t) = F(0)G(t) = 0, u{L, /) = F(L)G{t) = 0 for all /. 

We fust solve (5). If G = 0, then u = FG = 0, which is of no interest. Hence G # 0 
and then by (7), 

(8) (a) F(0) = 0, (b) F(L) = 0. 

We show that k must be negative. For k = 0 the general solution of (5) is F = ax + b, 
and from (8) we obtain a = b = 0, so that F = 0 and u = FG = 0, which is of no interest. 
For positive k = /x 2 a general solution of (5) is 


F = Ae** + 
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and from (8) we obtain F = 0as before (verify!). Hence we are left with the possibility 
of choosing k negative, say, k = -/? 2 . Then (5) becomes F" + p 2 F = 0 and has as a 
general solution 

F(x) = A cos px -f B sin px. 

From this and (8) we have 

F( 0) = A = 0 and then F(L) = B sin pL = 0. 

We must take B ^ 0 since otherwise F = 0. Hence sin pL = 0. Thus 

ni t 

(9) pL = «7r, so that p = — - (n integer). 

La 

Setting B = I, we thus obtain infinitely many solutions F(x) — F n (x), where 

(10) F n (x) = sin ~~”A* (n = 1, 2, • • •)• 

These solutions satisfy (8). [For negative integer n we obtain essentially the same solutions, 
except for a minus sign, because sin (-a) = —sin a.] 

We now solve (6) with k = — p 2 = -(, nir/L ) 2 resulting from (9), that is, 

(11*) G + A 2 G = 0 where An = cp = — . 

A general solution is 

G n {t) = B n cos A^/ + B n * sin A^/. 

Hence solutions of (1) satisfying (2) are w n (x, /) = F n (x)G n (t) = GnCOF^A), written out 

nrr 

(11) w n U\ /) = ( B n cos A tl / + £ n * sin An/) sin —x (n = 1, 2, • • •)* 


These functions are called the eigenfunctions, or characteristic functions , and the values 
An = cnirlL are called the eigenvalues, or characteristic values , of the vibrating string. 
The set {A 1? A 2 , • • •} is called the spectrum. 

Discussion of Eigenfunctions. We see that each u n represents a harmonic motion having 
the frequency A,/27 t = cn!2L cycles per unit time. This motion is called the nth normal 
mode of the string. The first normal mode is known as the fundamental mode (n = 1), 
and the others are known as overtones ; musically they give the octave, octave plus fifth, 
etc. Since in (1 1) 


sin 


HTTX 

~r 


= 0 


at 


_ L_ 2L 
n ’ n 


n - 1 
n 


L. 


the nth normal mode has n — I nodes, that is, points of the string that do not move (in 
addition to the fixed endpoints); see Fig. 284. 
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71=1 


77 = 2 


n = 3 


77 . = 4 


Fig. 284. Normal modes of the vibrating string 


Figure 285 shows the second normal mode for various values of /. At any instant the 
string has the form of a sine wave. When the left part of the string is moving down, the 
other half is moving up, and conversely. For the other modes the situation is similar. 

Tuning is done by changing the tension T. Our formula for the frequency XJlir — cn!2L 
of u n with c = Vffp [see (3), Sec. 12.2] confirms that effect because it shows that the 
frequency is proportional to the tension. T cannot be increased indefinitely, but can you 
see what to do to get a string with a high fundamental mode? (Think of both L and p.) 
Why is a violin smaller than a double-bass? 



Fig. 285. Second normal mode for various values of t 


Step 3. Solution of the Entire Problem. Fourier Series 

The eigenfunctions (11) satisfy the wave equation (1) and the boundary conditions (2) 
(string fixed at the ends). A single u n will generally not satisfy the initial conditions (3). 
But since the wave equation (1) is linear and homogeneous, it follows from Fundamental 
Theorem 1 in Sec. 12.1 that the sum of finitely many solutions u n is a solution of (1). To 
obtain a solution that also satisfies the initial conditions (3), we consider the infinite series 
(with ^ = cmtIL as before) 


(12) u(x, /) = 2 «nU'» 0 = 2 (fin cos + B n * sin k n t) sin — x. 

7i= 1 n=l 

Satisfying Initial Condition (3a) (Given Initial Displacement). From (12) and (3a) 
we obtain 

CO 

(13) u(x, 0) = X B n sin —x = f(x). 

n= 1 L 

Hence we must choose the B n 's so that u(x t 0) becomes the Fourier sine series of f(x). 
Thus, by (4) in Sec. 1 1.3, 


(14) 



dx , 


n = 1, 2, • • • . 
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Satisfying Initial Condition (3b) (Given Initial Velocity). Similarly, by differentiating 
(12) with respect to t and using (3b), we obtain 


du v 117TX 

— = 2j (~B n K sin A n t + B n *\n cos A„/) sin —— 

t=0 Ln=l L _ t=0 


^ /I7TA 

= 2 fi«*An sin — — = g(x). 

n—1 L 

Hence we must choose die B n *’s so that for t = 0 the derivative du/dt becomes the Fourier 
sine series of £(a). Thus, again by (4) in Sec. 1 1.3, 


2 r L mtx 

B n *K = T Six) sm — — dx. 

L J o L 


Since \ n = cnirlL, we obtain by division 


2 r tiirx 

= g(x) sin — — dx , 

C«7T L 


n = 1, 2, 


Result. Our discussion shows that u(x , /) given by (12) with coefficients (14) and (15) 
is a solution of (1) that satisfies all the conditions in (2) and (3), provided the series (12) 
converges and so do the series obtained by differentiating (12) twice termwise with respect 
to x and t and have the sums d 2 «/d* 2 and d 2 u/dt 2 t respectively, which are continuous. 

Solution (12) Established. According to our derivation the solution (12) is at first a 
purely formal expression, but we shall now establish it. For the sake of simplicity we 
consider only the case when the initial velocity g(x) is identically zero. Then the B n * are 
zero, and (12) reduces to 


u(x, t) = 2 B n cos \ n t sin 


A” L 


It is possible to sum this series , that is, to write the result in a closed or finite form. For 
this purpose we use the formula [see (11), App. A3.1] 


C/7 77* YlTt 

cos — — t sin —X 

L L 


1 f /ITT 

= T L sln u 


( x — cm + sin \ — - ( x H- ct) 


Consequently, we may write (16) in the form 


1^ f niT ] 1^» f n'TT 

u(x, t) = — 2u sm | — (x - ct) + — 2j K s»n — (x + ct) ■ . 

n=l f J n=l t 

These two series are those obtained by substituting a* — ct and a + ct , respectively, for 
the variable a in the Fourier sine series (13) for /(a). Thus 


(17) 


U(x, t) = §[/*(* - ct) + f*(x + ct)] 
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EXAMPLE 1 


where /* is the odd periodic extension of / with the period 2 L (Fig. 286). Since the initial 
deflection f(x) is continuous on the interval 0 ^ x ^ L and zero at the endpoints, it follows 
from (17) that m(jc, t) is a continuous function of both variables x and t for all values of 
the variables. By differentiating (17) we see that u(x , /) is a solution of (1), provided f(x) 
is twice differentiable on the interval 0 < x < L, and has one-sided second derivatives at 
x = 0 and x = L, which are zero. Under these conditions u(x> t ) is established as a solution 
of (1), satisfying (2) and (3) with g(x) = 0. ■ 



Fig. 286. Odd periodic extension of f(x) 


Generalized Solution. If f ' (*) and / "(x) are merely piecewise continuous (see Sec. 6. 1 ), 
or if those one-sided derivatives are not zero, then for each t there will be finitely many 
values of x at which the second derivatives of u appearing in (1) do not exist. Except at 
these points the wave equation will still be satisfied. We may then regard m(j t\ t) as a 
“generalized solution,” as it is called, that is, as a solution in a broader sense. For instance, 
a triangular initial deflection as in Example l (below) leads to a generalized solution. 

Physical Interpretation of the Solution (17). The graph of f*(x — ct) is obtained from 
the graph of f*(x) by shifting the latter ct units to the right (Fig. 287). This means that 
/*(; x — ct) (c > 0) represents a wave that is traveling to the right as t increases. Similarly, 
/*(* 4- ct) represents a wave that is traveling to the left, and u(x , t) is the superposition 
of these two waves. 



Vibrating String if the Initial Deflection Is Triangular 

Find the solution of the wave equation (1) corresponding to the triangular initial deflection 


/« = 


2k 

~ {L ~ X) 


if 


L 

— C * C L 
2 


and initial velocity zero. (Figure 288 shows f(x) = m(.y, 0) at the top.) 

Solution . Since g(x) s 0. we have B n * = 0 in ( 12), and from Example 4 in Sec. 1 1.3 we see that the B n 
are given by (5), Sec. 11.3. Thus (12) takes the form 

. v 8k T 1 tt 7 tc 1 3tt 37 rc 

u(x, 0 = — — j sin —v cos — / - — g sin —x cos — —t + - • • • 

77 L 1 L 3 L L 

For graphing the solution we may use u(x % 0) = f(x) and the above interpretation of the two functions in the 
representation (17). This leads to the graph shown in Fig. 288. ■ 
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0 L 0 L 




= \f*(x + L) 

Fig. 288. Solution u(x, f) in Example 1 for various values of t (right part 
of the figure) obtained as the superposition of a wave traveling to the 
right (dashed) and a wave traveling to the left (left part of the figure) 




TEtBSEBrEEM==&£J— T: 


T-Io] DEFLECTION OF THE STRING 

Find w(a, t) for the string of length L = 1 and c 2 = 1 when 
the initial velocity is zero and the initial deflection with 
small k (say, 0.01) is as follows. Sketch or graph u(x, t ) as 
in Fig. 288. 

1. k sin 2 t tx 2. k ( sin t tx - 5 sin 3nx) 

3. kx( 1 — x) 4. kx(i — a* 2 ) 

5. 

0.1 

0.5 1 
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11. (Frequency) How does the frequency of the 
fundamental mode of the vibrating string depend on 
the length of the suing? On the mass per unit length? 
What happens to the string if we double the tension? 
Why is a contrabass larger than a violin? 

12. (Nonzero initial velocity) Find the deflection w(x, t) 
of the string of length L = tt and c 2 = 1 for zero 
initial displacement and “triangular” initial velocity 
« t (jc, 0) = 0.0 l.v if 0 = a = r, u t {x, 0) = 0.01 (tt — x) 
if \tt = a = tt. (Initial conditions with u t ( a, 0) ^ 0 are 
hard to realize experimentally.) 

13. CAS PROJECT. Graphing Normal Modes. Write a 
program for graphing u n with L = tt and c 2 of your 
choice similarly as in Fig. 284. Apply the program to 
u 2 , u 3j w 4 . Also graph these solutions as surfaces over 
the jtf-plane. Explain the connection between these two 
kinds of graphs. 

14. TEAM PROJECT. Forced Vibrations of an Elastic 
String. Show the following. 

(a) Substitution of 

(17) u(a, 0 = 2 G n ( 0 sin — — 

n=l L 

(L = length of the string) into the wave equation (1) 
governing free vibrations leads to [see (10*)] 


If A n 2 =£ to 2 , the solution is 


G n (t) = B n cos A n t 4- B n * sin A n / 


+ 


2A(1 — cos/itt) 
/?7r(A n 2 — (o 2 ) 


sin (ot. 


Determine B n and B n * so that u satisfies the initial 
conditions u(x, 0) = /(a), u t (x> 0) = 0. 

(d) (Resonance) Show that if A*. = <w, then 

G„(t) = B n cos (ot + sin <or 
A 

— (1 — cos n7r)t cos cot. 

tl 7 T 0 J 


(e) (Reduction of boundary conditions) Show that 
a problem (1)— (3) with more complicated boundary 
conditions u( 0, t ) = 0, it(L, t) = /?.(/), can be reduced 
to a problem for a new function v satisfying conditions 
u(0, /) = v{L y t) = 0. u(a, 0) = v t (x y 0) = 8l (x) 
but a nonhomogeneous wave equation. Hint: Set 
it = u - f w and determine w suitably. 



.. 9 cn TT 

(18) G n + A „ 2 G = 0, A,, = — . 


15-20 


SEPARATION OF A FOURTH-ORDER PDL 
VIBRATING BEAM 


(b) Forced vibrations of the string under an external 
force P( a, r) per unit length acting normal to the string 
are governed by the PDE 


By the principles used in modeling the string it can be 
shown that small free vertical vibrations of a uniform elastic 
beam (Fig. 289) are modeled by the fourth-order PDE 


(19) 


«tt = c 2 u xx + 


£ 

P * 


( 21 ) 


d 2 u _ 2 d 4 u 

Ji 2 “ " c 


(Ref. [Cl 1]) 


(c) For a sinusoidal force P = Ap sin cot we obtain 

P ™ U 7 TA 


i ^ n n -v 

— = A sin (ot - 2 j knW sin ”7“ » 
P n-l L 


( 20 ) 


UO = 


\(4Afmr) sin cot ( n odd) 

0 (n even). 

Substituting (17) and (20) into (19) gives 

2 2A 

G n + A n G n = — (1 - cos mr) sin cot. 
mr 


where c 2 = ElfpA ( E = Young’s modulus of elasticity, 
I = moment of intertia of the cross section with respect to 
the y-axis in the figure, p = density, A = cross-sectional 
area). {Bending of a beam under a load is discussed in 
Sec. 3.3.) 

15. Substituting u = F{x)G(t) into (21), show that 
F“VF = -G/c 2 G = p 4 = const, 

F{ a) = A cos fix + B sin fix 

■F C cosh fix 4- D sinh fix , 

G(t) = a cos cfi 2 t 4- b sin cfi 2 t. 
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(A) Simply supported 


x = L 



x - 0 x = L 


(B) Clamped at both 
ends 


18. Compare the results of Probs. 17 and 3. What is the 
basic difference between the frequencies of the 
normal modes of the vibrating string and the vibrating 
beam? 

19. (Clamped beam in Fig. 290B) What are the boundary 
conditions for the clamped beam in Fig. 290B? Show 
that F in Prob. 15 satisfies these conditions if PL is a 
solution of the equation 


(C) Clamped at the left 
end, free at the 
1 . right end 

x = 0 x - L 

Fig. 290. Supports of a beam 

16. (Simply supported beam in Fig. 290A) Find solutions 
u n = F n (x)G n (t ) of (21) corresponding to zero initial 
velocity and satisfying the boundary conditions (sec 
Fig. 290A) 

m(0, /) = 0, u(L y t) = 0 

(ends simply supported for all times /), 


(22) cosh j 3L cos PL = I . 

Determine approximate solutions of (22), for instance, 
graphically from the intersections of the curves of 
cos PL and 1/cosh pL. 

20. (Clampcd-free beam in Fig. 290C) If the beam is 
clamped at the left and free at the right (Fig. 290C), 
the boundary conditions are 

«(0, t) = 0, m*(0, t) = 0, 

u xx(L. 0 = 0, u xxx (L y t ) — 0. 


U XX (0y /) = 0, U XX (Ly t) = 0 
(zero moments, hence zero curvature, at the ends). 

17. Find the solution of (21) that satisfies the conditions in 
Prob. 16 as well as the initial condition 

u(Xy 0) = f(x) = x(L - x). 


Show that F in Prob. 15 satisfies these conditions if p L 
is a solution of the equation 

(23) cosh pL cos PL = - 1 . 

Find approximate solutions of (18). 


12.4 D’Alembert’s Solution 
of the Wave Equation. 

Characteristics 

It is interesting that the solution (17), Sec. 12.3, of the wave equation 

d 2 u 9 d 2 u 2 T 

c= 7’ 

can be immediately obtained by transforming (1) in a suitable way, namely, by introducing 
the new independent variables 


(2) v = x + cl w = x — ct. 

Then u becomes a function of u and w. The derivatives in (1) can now be expressed in 
terms of derivatives with respect to v and w by the use of the chain rule in Sec. 9.6. 
Denoting partial derivatives by subscripts, we see from (2) that v x = 1 and w x = 1. For 
simplicity let us denote m(jc, /)> as a function of v and w, by the same letter u. Then 


^ = Wx + Wx = + *ur 
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We now apply the chain rule to the right side of this equation. We assume that all the 
partial derivatives involved are continuous, so that u wv = u vw . Since v x — 1 and w x = 1, 
we obtain 

"f* W w )v (Mv V^X o)w^X ^VV 4“ l 1 \qw 

Transforming the other derivative in (1) by the same procedure, we find 

Mtt ^ (Mu V 2lt vw 4 “ Mww ) • 

By inserting these two results in (1) we get (see footnote 2 in App. A3.2) 


( 3 ) 


m vw 


d 2 u 
dw dv 


= 0 . 


The point of the present method is that (3) can be readily solved by two successive 
integrations, first with respect to w and then with respect to v. This gives 


du 

— = Kv) 

dV 


and u = J/j(u) dv 4- «A(w). 


Here h(v) and ip(w) are arbitrary functions of v and w, respectively. Since the integral is 
a function of u, say, 0(u), the solution is of the form u = <f)(v) 4- In terms of 

.v and /, by (2), we thus have 


( 4 ) 


u(x, t) = (j>(x 4- ct) 4- i[/(x - cl). 


This is known as d’Alembert’s solution 1 of the wave equation (1). 

Its derivation was much more elegant than the method in Sec. 12.3, but d’Alembert’s 
method is special, whereas the use of Fourier series applies to various equations, as we 
shall see. 

D'Alembert's Solution Satisfying the Initial Conditions 

(5) (a) u(x, 0) = f(x), (b) u t (x 9 0) = g(x). 

These are the same as (3) in Sec. 12.3. By differentiating (4) we have 

(6) u t 0 c, t ) = c<f> r (x 4- ct) — ct// (x — ct) 


2 JEAN LE ROND D’ALEMBERT (1717-1783). French mathematician, also known for his important work 
in mechanics. 

We mention that the general theory of PDEs provides a systematic way for finding the transformation (2) that 
simplifies (1). See Ref. [C8] in App. I. 
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where primes denote derivatives with respect to the entire arguments a- + ct and x — cl, 
respectively, and the minus sign comes from the chain rule. From (4)-(6) we have 

(7) u{x, 0) = <f>(x) + < !t(x) = f{x), 

(8) u t (x, 0) = c<£'( x) - ci/j\x) = g(x ). 

Dividing (8) by c and integrating with respect to x, we obtain 

(9) (j>(x) - i//(x) = k( x 0 ) + — f g(s) ds, k{x Q ) = <£(a‘ 0 ) - ip(x 0 ). 

C J X 0 

If we add this to (7), then i/r drops out and division by 2 gives 

1 1 r x 1 

(10) 0(x) = — /(a) + — I g(s) ds + — k(x 0 ). 

2 2c wVo 2 

Similarly, subtraction of (9) from (7) and division by 2 gives 

(11) 0(A) = “ fix) - 2- f gis ) ds - k(x 0 ). 

2 2c 2 

In (10) we replace a by a + cr; we then get an integral from a 0 to a- + ct. In (1 1) we 
replace x by a - ct and get minus an integral from x 0 to a — ct or plus an integral from 
a — ct to a 0 . Hence addition of 0 ( a + ct) and 0(x — ct) gives u(x, t) [see (4)] in the form 

1 1 r x+ct 

(12) k(a, t) = — [/(a + ct) + fix - cr)] + — I g(s) ds. 

2 2c J x—ct 

If the initial velocity is zero, we see that this reduces to 

(13) m(a, t) = 3 [/(a + ct) + fix - ct)], 

in agreement with (17) in Sec. 12.3. You may show that because of the boundary conditions 
(2) in that section the function / must be odd and must have the period 2 L. 

Our result shows that the two initial conditions [the functions fix) and gix) in (5)] 
determine the solution uniquely. 

The solution of the wave equation by the Laplace transform method will be shown in 
Sec. 12.11. 


Characteristics. Types and Normal Forms of PDEs 

The idea of d’Alembert’s solution is just a special instance of the method of 
characteristics. This concerns PDEs of the form 


(14) 


Altxx ^BlAxy -f" CtAyy y, U , U x , Uy) 
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EXAMPLE 1 


(as well as PDEs in more than two variables). Equation (14) is called quasilinear because 
it is linear in the highest derivatives (but may be arbitrary otherwise). There are three 
types of PDEs (14), depending on the discriminant AC — 5 2 , as follows. 


Type 

Defining Condition 

Example in Sec. 12.1 

Hyperbolic 

AC- B 2 < 0 

Wave equation (1) 

Parabolic 

AC - B 2 = 0 

Heat equation (2) 

Elliptic 

AC - B 2 > 0 

Laplace equation (3) 


Note that (1) and (2) in Sec. 12.1 involve /, but to have y as in (14), we set y = ct in 
(1), obtaining u ti — = c 2 (u yy — u xx ) = 0. And in (2) we set y = c 2 t , so that 

u t - C 2 w ux = C 2 (u y - Uxx). 

A, B y C may be functions of a\ >\ so that a PDE may be of mixed type, that is, of 
different type in different regions of the xy-plane. An important mixed-type PDE is the 
Tricomi equation (see Prob. 10). 

Transformation of (14) to Normal Form. The normal forms of (14) and the 
corresponding transformations depend on the type of the PDE. They are obtained by 
solving the characteristic equation of (14), which is the ODE 

(15) A/ 2 - 2By' + C = 0 

where/ = dylclx (note —25, not +25). The solutions of (15) are called the characteristics 
of (14), and we write them in the form <1 >(a\ y) = const and ^(x, y) = const. Then the 
transformations giving new variables u, w instead of x , y and the normal forms of (14) 
are as follows. 


Type 

New Variables 

Normal Form 

Hyperbolic 

Parabolic 

Elliptic 

v = <f> 

V = X 

V = I«f> + 

w = 'P 
w = 4> = 'P 
W = ^t(<P - 'p) 

t'tvw 

U unv ~ ^2 
M vv ^wio ^3 


Here, 0 = 3 >( a % y ), V = ^(x, y), F x = w , u, u v , u w ), etc., and we denote it as 
function of u, w again by m, for simplicity. We see that the normal form of a hyperbolic 
PDE is as in d’Alembert’s solution. In the parabolic case we get just one family of solutions 
<P = 'P. In the elliptic case, / = V— I, and the characteristics are complex and are of 
minor interest. For derivation, see Ref. [GR3] in App. 1. 

D'Alembert’s Solution Obtained Systematically 

The theory of characteristics gives d’Alembert’s solution in a systematic fashion. To see this, we write the wave 
equation u tt - c 2 u xx = 0 in the form (14) by setting y = cl. By the chain rule, u t = tt y y t = cu y and 
«tt = c2{i yy Division by c 2 gives u xx - u yy — 0, as stated before. Hence the characteristic equation is 
.v /2 ~ 1 = (y* + IX/ ~ 1) = 0. The two families of solutions (characteristics) are 4 >(a. y) = y + .v = const 
and +X-V, y) = y - a* = const. This gives the new variables v = <1> = y + x - ct + .v and 
w - W = y - x = ct — x and d’Alembert’s solution it = fi(x + ct) + / 2 ( x - ct). ■ 
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1. Show that c is the speed of each of the two waves given 
by (4). 

2. Show that because of the boundary conditions (2), 
Sec. 12.3, the function / in (13) of this section must 
be odd and of period 2 L. 

3. If a steel wire 2 m in length weighs 0.9 nt (about 0.20 lb) 
and is stretched by a tensile force of 300 nt (about 67.4 
lb), what is the corresponding speed of transverse waves? 

4. What are the frequencies of the eigenfunctions in 
Prob. 3? 

5. Longitudinal Vibrations of an Elastic Bar or Rod. 
These vibrations in the direction of the .v-axis are 
modeled by the wave equation u tl = c 2 u xx , c 2 = E/p 
(see Tolstov [C9], p. 275). Tf the rod is fastened at one 
end, x = 0, and free at the other, a* = L, we have 
w(0, 0 = 0 and u x (L , /) = 0. Show that the motion 
corresponding to initial displacement u (a, 0) = /(a) 
and initial velocity zero is 

oc 

it = X sinp n A cos p n ct, 
n«0 

2 r L (2 it + 1 )tt 

A„ = Y J I /(*) sin p„x dx, p n = — . 


6^9] GRAPHING SOLUTIONS 

Using (13), sketch or graph a figure (similar to Fig. 288 in 
Sec. 12.3) of the deflection m(a, 1 ) of a vibrating string 
(length L = 1, ends fixed, c = I) starting with initial 
velocity 0 and initial deflection (k small, say, k = 0.01). 

6. f(x) = k sin ttx 7. /(a) = k(i — cos 2 tta) 

8. /(a) = kx( 1 - A") 9. /(A) = k (a* - A- 3 ) 

10. (Tricomi and Airy equations 2 ) Show that the Triconti 
equation yu xx + it yy = 0 is of mixed type. Obtain the 

Airy equation G" — yG = 0 from the Tricomi equation 
by separation. (For solutions, see p. 446 of Ref. [GR 1 J 
listed in App. 1.) 


1 1 1-20 1 NORMAL FORMS 

Find the type, transform to normal form, and solve. (Show 
the details of your work.) 


11. 

u xy 

- 

“w = 

= 0 

12. 


— 

2 »xy 

+ “vu = 

0 

13. 

11 XX 

+ 

9u ,m 

= 0 

14. 

ll xx 

+ 

lt xy 

- 2il w = 

0 

15. 

"xx 

+ 


+ ttyy = 0 

16. 

XUxy * 

■ yi'm = 0 


17. 

u xx 

- 

4“xy 

+ 4 llyy = 0 

18. 

“xx 

+ 

2 <lxy 


= 0 

19. 


0 

II 

20. 

"xx 

- 

4 "x U 

3llyy - 

= 0 


12.5 Heat Equation: Solution by Fourier Series 

From the wave equation we now turn to the next “big” PDE, the heat equation 




which gives the temperature m(a, y, <:, /) in a body of homogeneous material. Here c 2 is 
the thermal diffusivity, K the thermal conductivity, a the specific heat, and p the density 
of the material of the body. V 2 « is the Laplacian of w, and with respect to Cartesian 
coordinates a, y, z , 


V 2 « 


d 2 u d 2 u d 2 u 

a? + + d?' 


The heat equation was derived in Sec. 10.8. It is also called the diffusion equation. 

As an important application, let us first consider the temperature in a long thin metal 
bar or wire of constant cross section and homogeneous material, which is oriented along 
the A-axis (Fig. 29 1 ) and is perfectly insulated laterally, so that heat flows in the A-direction 


2 SIR GEORGE BIDELL AIRY (1801-1892), English mathematician, known for his work in elasticity. 

FRANCESCO TRICOMI (1897-1978), Italian mathematician, who worked in integral equations. 
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0 


x — L 

Fig. 291. Bar under consideration 


only. Then u depends only on x and time f, and the heat equation becomes the 
one-dimensional heat equation 


du g c> 2 
dt C dx 2 


This seems to differ only very little from the wave equation, which has a term u tt instead 
of w t , but we shall see that this will make the solutions of (1) behave quite differently 
from those of the wave equation. 

We shall solve (1) for some important types of boundary and initial conditions. We 
begin with the case in which the ends x = 0 and x = L of the bar are kept at temperature 
zero, so that we have the boundary conditions 

(2) w(0, t) = 0, w(L, /) = 0 for all t. 

Furthermore, the initial temperature in the bar at time t = 0 is given, say, /(*), so that we 
have the initial condition 

(3) u(x, 0) = f(x) [/(*) given]. 


Here we must have /(0) = 0 and f{L) = 0 because of (2). 

We shall determine a solution u(x, t) of (1) satisfying (2) and (3) — one initial condition 
will be enough, as opposed to two initial conditions for the wave equation. Technically, 
our method will parallel that for the wave equation in Sec. 12.3: a separation of variables, 
followed by the use of Fourier series. You may find a step-by-step comparison worthwhile. 

Step 1 . Two ODEs from the heat equation (1). Substitution of a product 
u(x, t ) = F(x)G(t) into (1) gives FG = c 2 F"G with G = dGIdt and F" = d 2 F/dx z . To 
separate the variables, we divide by c 2 FG, obtaining 


(4) 



El 

F ' 


The left side depends only on t and the right side only on x, so that both sides must equal 
a constant k (as in Sec. 12.3). You may show that for k = 0 or k > 0 the only solution 
u = FG satisfying (2) is u = 0. For negative k = —p 2 we have from (4) 


G 


c 2 G 


F" 

F 


- _„2 


Multiplication by the denominators gives immediately the two ODEs 


(5) 


F" + p 2 F= 0 
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and 


(6) G + c 2 p 2 G = 0. 

Step 2. Satisfying the boundary conditions (2). We first solve (5). A general solution is 

(7) F(x) = A cos px + B sin px. 

From the boundary conditions (2) it follows that 

w(0, t) = F(0)G(0 = 0 and «(L, 0 = F(L)G(t) = 0. 

Since G = 0 would give u = 0, we require F(0) = 0, F(L) = 0 and get F(0) = A = 0 
by (7) and then F(L) = B sin pL = 0, with B ^ 0 (to avoid F = 0); thus. 


sin pL = 0, 


hence 


tlTT 


p = — , 7? = 1 , 2, * * * . 


Setting F = 1, we thus obtain the following solutions of (5) satisfying (2): 

ri7rx 


F n (x) = sin 


n = 1, 2, 


(As in Sec. 12.3, we need not consider negative integral values of n.) 

All this was literally the same as in Sec. 12.3. From now on it differs since (6) differs 
from (6) in Sec. 12.3. We now solve (6). For p = rnr/L , as just obtained, (6) becomes 


where 


G + A 2 G = 0 
It has the general solution 

G n (t) = B n e-*»\ 

where B n is a constant. Hence the functions 

(8) u n 0 t, t) = F n (x)G n (t) = B n sin 


A** 


Ctl'TT 


tlTTX x 2 . 
1 


n — 1 , 2 , 


{n = 1 , 2 , • • •) 


are solutions of the heat equation (1), satisfying (2). These are the eigenfunctions of the 
problem, corresponding to the eigenvalues A^ = cnir/L. 

Step 3 . Solution of the entire problem. Fourier series. So far we have solutions (8) 
satisfying the boundary conditions (2). To obtain a solution that also satisfies the initial 
condition (3), we consider a series of these eigenfunctions, 


(9) 


u(x, 0 = 2 u n( x > 0 = 2*n si 


nirx 


sin 




n=l 


( CHIT \ 
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EXAMPLE 1 


EXAMPLE 2 


From this and (3) we have 

^ riTT.x 

u(x, 0) = X B n sin — — = f(x). 

71=1 L 

Hence for (9) to satisfy (3), the B n y s must be the coefficients of the Fourier sine series, 
as given by (4) in Sec. 1 1.3; thus 


( 10 ) 



(n = 1, 2, • • •)• 


The solution of our problem can be established, assuming that f(x) is piecewise 
continuous (see Sec. 6.1) on the interval 0 ^ x ^ L and has one-sided derivatives (see 
Sec. 11.1) at all interior points of that interval; that is, under these assumptions the series 
(9) with coefficients (10) is the solution of our physical problem. A proof requires 
knowledge of uniform convergence and will be given at a later occasion (Probs. 19, 20 
in Problem Set 15.5). 

Because of the exponential factor, all the terms in (9) approach zero as t approaches 
infinity. The rate of decay increases with n. 

Sinusoidal Initial Temperature 

Find the temperature u{x, /) in a laterally insulated copper bar 80 cm long if the initial temperature is 
100 sin (tta/ 80) °C and the ends are kept at 0°C. How long will it take for the maximum temperature in the 
bar to drop to 50°C? First guess, then calculate. Physical data for copper: density 8.92 gm/cm , specific heat 
0.092 cal/(gm °C), thermal conductivity 0.95 cal/(cm sec °C). 

Solution, The initial condition gives 

M T.V 7TX 

w(.v. 0) = 2j sin -gjp = fix) = 100 sin — . 

Hence, by inspection or from (9) we get Bi = 100. B 2 = B$ = • • • = 0. In (9) we need Aj 2 = c 2 ?? 2 //, 2 
where c 2 = Kfiap) = 0.95/(0.092 * 8.92) = 1.158 [cm 2 /sec]. Hence we obtain 

A] 2 - 1.158 -9.870/80 2 = 0.001785 [sec -1 ]. 


The solution (9) is 

Iitr, 0-100 sin ^ e -0 00178 * 

80 

Also, 100<r°- ool785t = 50 when t = (In 0.5)/(-0.00l785) = 388 [sec] ~ 6.5 [mini. Does your guess, or at 
least its order of magnitude, agree with this result? H 

Speed of Decay 

Solve the problem in Example 1 when the initial temperature is 100 sin (3 tt.y/ 80) °C and the other data are as 
before. 

Solution, in (9), instead of n - 1 we now have n = 3, and A 3 2 = 3 2 A! 2 = 9 * 0.001785 = 0.01607. so that 
the solution now is 

«(jr, t) = 100 sin ^ c -o.oieo7t 

80 

Hence the maximum temperature drops to 50°C in / = (In 0.5)/(— 0.01607) » 43 [seconds], which is much 
faster (9 times as fast as in Example 1 ; why?). 
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Had we chosen a bigger n, the decay would have been still faster, and in a sum or series of such terms, each 
term has its own rate of decay, and terms with large n are practically 0 after a very short time. Our next example 
is of this type, and the curve in Fig. 292 corresponding to t = 0.5 looks almost like a sine curve; that is, it is 
practically the graph of the first term of the solution. ■ 




K X 

Fig. 292. Example 3. Decrease of temperature 
with time t for L = i r and c = 1 


“Triangular” Initial Temperature in a Bar 

Find the temperature in a laterally insulated bar of length L whose ends are kept at temperature 0, assuming that 
the initial temperature is 

f jc if 0 < * < L/2 t 

/(*) = 

( L - x if LI2 < x < L. 


(The uppermost part of Fig. 292 shows this function for the special L = tt.) 
Solution . From (10) we get 

2 / f nirx f tiTTX 

(10*) 5 n = y I I -v sin ~r~ dx + I (L - x) sin —r~ dx 

L Vo L J U2 L / 

Integration gives B n = 0 if n is even, 


4 L 4 L 

B n= Ti (« = 1.5, 9, •••) and B n gT 

n tt n 7T 

(see also Example 4 in Sec. 1 1.3 with k = LIT). Hence the solution is 



(n = 



3, 7, 11, •••). 



Figure 292 shows that the temperature decreases with increasing r. because of the heat loss due to the cooling 
of the ends. 

Compare Fig. 292 and Fig. 288 in Sec. 12.3 and comment ■ 
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EXAMPLE 4 


EXAMPLE 5 


Bar with Insulated Ends. Eigenvalue 0 

Find a solution formula of (1). (3) with (2) replaced by the condition that both ends of the bar are insulated. 

Solution . Physical experiments show that the rate of heat flow is proportional to the gradient of the 
temperature. Hence if the ends .r = 0 and x = L of the bar are insulated, so that no heat can flow through the 
ends, we have grad u = u x = du/d x and the boundary conditions 

(2*) u x (0, t) = 0, u x (L y /) = 0 for all /. 

Since «(.v. /) = F(x)G(t). this gives « x (0, /) = F'(0)G(/) = 0 and u x {L, /) = F f (L)G(t) = 0. Differentiating (7), 
we have F f (x) = —Ap sin px + Bp cos px. so that 

F*( 0) = Bp = Q and then F\L) = — Ap sin pL = 0. 

The second of these conditions gives p = p n = utt/L , (/i = 0, 1, 2. • • •)• From this and (7) with A = 1 
and B = 0 we get F n (.t) = cos (/ittaVL) , (n = 0, 1, 2, • * •). With G n as before, tills yields the eigenfunctions 

(11) ««(•*. 0 = F n (x)G n (t) = A n cos — e A ” (« = 0, 1, • • •) 


corresponding to the eigenvalues A n = c/itt/L. The latter are as before, but we now have the additional eigenvalue 
A 0 = 0 and eigenfunction « 0 = const, which is the solution of the problem if the initial temperature f(x) is 
constant. This shows the remarkable fact that a separation constant can very well be zero , and zero can be an 
eigenvalue. 

Furthermore, whereas (8) gave a Fourier sine series, we now get from (11) a Fourier cosine series 


( 12 ) 



c/nr 

~ 


\ 


Its coefficients result from the initial condition (3), 

^ mrx 

uC v. 0) = 2 K cos —r~ = M. 

n=0 L 


in the form (2), Sec. 1 1 .3, that is. 

1 f L 2 f L rnrx _ 

(13) A° = — J fix) dx y A n = — J fix) cos dx, n * I, 2, • • • . ■ 


"Triangular” Initial Temperature in a Bar with Insulated Ends 

Find die temperature in the bar in Example 3, assuming that the ends are insulated (instead of being kept at 
temperature 0). 

Solution . For the triangular initial temperature. (13) gives A$ = L/4 and (see also Example 4 in Sec. 11.3 
with k = U2) 

2 [ f U2 mrx 

A »=i [J 0 xa * — dx + 

Hence the solution (12) is 


SL 

1 llTX f / 2c7T \ 2 

1 6m 

/ 6 err \ 2 

i i 

J 

[¥ cos ~ exp [- It; 

Zj + ^cos — exp - 





We see that the terms decrease with increasing /. and u — » L/4 as t »: this is the mean value of the initial 
temperature. This is plausible because no heat can escape from this totally insulated bar. In contrast, the cooling 
of the ends in Example 3 led to heat loss and u — ► 0. the temperature at which the ends were kept. M 
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Steady Two-Dimensional Heat Problems. 

Laplace's Equation 

We shall now extend our discussion from one to two space dimensions and consider the 
two-dimensional heat equation 


On 

"a7 


= c 2 V 2 u = c 2 


d 2 u d 2 u \ 
dx 2 + dy 2 J 


for steady (that is, time-independent) problems. Then Buldt = 0 and the heat equation 
reduces to Laplace’s equation 


(14) 


V 2 u = 


d 2 u d 2 U 

+ a7 


= 0 


(which has already occurred in Sec. 10.8 and will be considered further in 
Secs. 12.7-12.10). A heat problem then consists of this PDE to be considered in some 
region R of the A*v-plane and a given boundary condition on the boundary curve C of R. 
This is a boundary value problem (BVP). One calls it: 


First BVP or Dirichlet Problem if u is prescribed on C (“Dirichlet boundary 
condition”) 

Second BVP or Neumann Problem if the normal derivative u n = dufdn is 
prescribed on C (“Neumann boundary condition”) 

Third BVP, Mixed BVP, or Robin Problem if it is prescribed on a portion of C 
and u n on the rest of C (‘‘Mixed boundary condition”). 


y 

u = fix) 



0 

R 

u-0 



u = 0 


X 


Fig. 293. Rectangle R and given boundary values 


Dirichlet Problem in a Rectangle R (Fig. 293). We consider a Dirichlet problem for 
Laplace’s equation (14) in a rectangle /?, assuming that the temperature w(*, y ) equals a 
given function /( x) on the upper side and 0 on the other three sides of the rectangle. 

We solve this problem by separating variables. Substituting u( a\ y) = F(x)G(y) into 
(14) written as u ^ = — u yy , dividing by FG , and equating both sides to a negative constant, 
we obtain 
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From this we get 


1 d 2 F _ 1 d 2 G 

F* dx 2 G ’ dy 2 


2 + kF 0, 


= —k. 


and the left and right boundary conditions imply 

F(0) = 0, and F(a) = 0. 
This gives k = ( mr/a ) 2 and corresponding nonzero solutions 


(15) F(x) = F n (x) = sin — 

a 

The ODE for G with k = ( wr/a ) 2 then becomes 


n = 1, 2, 


Solutions are 


d 2 G I n 7rV 

\T; 


G = 0. 


G(y) = G n (y) = A n e n ^ /a + B n e~ n ^ a , 


Now the boundary condition u = 0 on the lower side of R implies that G n ( 0) = 0; that 
is, G n ( 0) = A n + B n = 0 or B n = -A n . This gives 

G n (y) = A„(e n ^ /a - e - n ^ /a ) = 2A n sinh ^ . 

From this and (15), writing 2 A n = A£, we obtain as the eigenfunctions of our problem 

(16) u n (x, y ) = F n (x)G n (y) = A* sin sinh — . 

a a 

These solutions satisfy the boundary condition u = 0 on the left, right, and lower sides. 

To get a solution also satisfying the boundary condition u(x , b) = f(x) on the upper 
side, we consider the infinite series 

oc 

u(x, y) = 2 “«(*> y)- 


From this and (16) with y = b we obtain 

^ * n7rx nirb 

u(x, b) = fix) = 2j Kx sin sinh . 

„ i a a 

n=l 


We can write this in the form 


, , v / * /27T& \ 777TX 

w(a:, b) = 2 j Mn sinh I sin . 

n=l ' a ) a 
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This shows that the expressions in the parentheses must be the Fourier coefficients b n of 
fix); that is, by (4) in Sec. 1 1.3, 

* nirb 2 r a mrx 

b n = A n sinh = — f(x) sin dx. 

a a j q a 

From this and (16) we see that the solution of our problem is 

^ ^ * mrx niry 

(17) u(x\ y) = 2j >0 = 2j K sm sinh 

a ci 


where 


(18) 


At = 


a sinh ( nrrbla ) J o 


f fix) si: 

•* e\ 


flTTX 

sin dx. 


We have obtained this solution formally, neither considering convergence nor showing 
that the series for u, w UT , and u yy have the right sums. This can be proved if one assumes 
that / and f r are continuous and f" is piecewise continuous on the interval 0 ^ a ^ ci. 
The proof is somewhat involved and relies on uniform convergence. It can be found in 
[C4] listed in App. 1. 

Unifying Power of Methods. Electrostatics, Elasticity 

The Laplace equation (14) also governs the electrostatic potential of electrical charges in 
any region that is free of these charges. Thus our steady-state heal problem can also be 
interpreted as an electrostatic potential problem. Then (17), (18) is the potential in the 
rectangle R when the upper side of R is at potential /(*) and the other three sides are 
grounded. 

Actually, in the steady-state case, the two-dimensional wave equation (to be considered 
in Secs. 12.7, 1 2.8) also reduces to (14). Then ( 1 7), ( 1 8) is the displacement of a rectangular 
elastic membrane (rubber sheet, drumhead) that is fixed along its boundary, with three 
sides lying in the Ay-plane and the fourth side given the displacement fix). 

This is another impressive demonstration of the unifying power of mathematics. It 
illustrates that entirely different physical systems may have the same mathematical model 
and can thus be treated by the same mathematical methods. 




1. WRITING PROJECT. Wave and Heat Equations. 

Compare the two PDEs with respect to type, general 
behavior of eigenfunctions, and kind of boundary and 
initial conditions and resulting practical problems. Also 
discuss the difference between Figs. 288 in Sec. 12.3 
and 292. 


2. (Eigenfunctions) Sketch (or graph) and compare the 
first three eigenfunctions (8) with B n - 1, c = 1, 
L ~ tt for / = 0, 0.2, 0.4, 0.6, 0.8, 1 .0. 

3. (Decay) How does the rate of decay of (8) with fixed 
n depend on the specific heat, the density, and the 
thermal conductivity of the material? 
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4. If the first eigenfunction (8) of the bar decreases to half 
its value within 10 sec, what is the value of the 
diffusivity? 

5^9] LATERALLY INSULATED BAR 

A laterally insulated bar of length 10 cm and constant 
cross-sectional area 1 cm 2 , of density 10.6 gm/cm 3 , thermal 
conductivity 1.04cal/(cm sec °C), and specific heat 
0.056 cal/(gm °C) (this corresponds to silver, a good heat 
conductor) has initial temperature fix) and is kept at 0°C 
at the ends x = 0 and a = 10. Find the temperature u( a, t) 
at later times. Here, /(a) equals: 

5. /(a) = sin OA'irx 

6 . /(a) = sin 0.1 tta + 5 sin 0.2 nx 

7. /(a) = 0.2a if 0 < a < 5 and 0 otherwise 

8. /(a) = 1 - 0.2|a - 5 1 

9. /(a) = a if 0 < a < 2.5, /(a) = 2.5 if 2.5 < a < 7.5, 
/(a) = 10 - a if 7.5 < a < 10 

10. (Arbitrary temperatures at ends) If the ends a = 0 
and a = L of the bar in the text are kept at constant 
temperatures U y and f/ 2 * respectively, what is the 
temperature h x (a) in the bar after a long time 
(theoretically, as t— > <*)? First guess, then calculate. 

11. In Prob. 10 find the temperature at any time. 

12. (Changing end temperatures) Assume that the ends 
of the bar in Probs. 5-9 have been kept at 1 00°C for a 
long time. Then at some instant, call it / = 0, the 
temperature at a = L is suddenly changed to 0°C and 
kept at 0°C, whereas the temperature at x = 0 is kept 
at 100°C. Find the temperature in the middle of the bar 
at t = 1, 2, 3, 10, 50 sec. First guess, then calculate. 

BAR UNDER ADIABATIC CONDITIONS 

“Adiabatic” means no heat exchange with the 
neighborhood, because die bar is completely insulated, also 
at the ends. Physical Information: The heat flux at the ends 
is proportional to the value of duldx there. 

13. Show that for the completely insulated bar, 
« T (0, /) = 0, « a .(L, /) = 0, m(a, /) = /(a) and separation 
of variables gives the following solution, with A n given 
by (2) in Sec. 1 1.3. 

ll( A, t) — Aq + 2 C °S ~T~ e~ {cnJ!r/Lrt 

n=l L 

14-19 Find the temperature in Prob. 13 with L — tv, 
c = 1, and 

14. /(a) = a 15. /(.v) = 1 

16. fix) = 0.5 cos 4a 17. /(a) = tt 2 - a 2 

18. f(x) = \tt - \x - £tt| 19. fix) = (a - l*) 2 

20. Find the temperature of the bar in Prob. 13 if the left 

end is kept at 0°C, the right end is insulated, and the 
initial temperature is U 0 = const. 


21. The boundary condition of heat transfer 

(19) -u x iir, t) = k[uiTr, t) - w 0 ] 

applies when a bar of length tt with c = 1 is laterally 
insulated, the left end x = 0 is kept at 0°C, and at the 
right end heat is flowing into air of constant 
temperature u 0 . Let k = 1 for simplicity, and « 0 = 0* 
Show that a solution is u( a, /) = sin px e~ p *, where 
p is a soludon of tanp7r = — p. Show graphically 
that this equation has infinitely many positive solutions 
Pi. P 2 , Ps. ■ ■ • . where p n > n - % and 
lim ( p n — n + i) = 0. (Formula (19) is also known 

n— * oo 

as radiation boundary condition, but this is 
misleading; see Ref. [C3], p. 19.) 

22. (Discontinuous /) Solve (1), (2), (3) with L - 7r 
and fix) = U 0 = const (=£ 0) if 0 < a < ^r/2, 
fix) = 0 if 7T/2 < a < 7T. 

23. (Heat flux) The heat flux of a soludon w(x, t) across 
a = 0 is defined by <f>it) = —Ku x ( 0, t). Find <£(r) for 
the solution (9). Explain the name. Is it physically 
understandable that goes to 0 as t — > “? 

OTHER HEAT EQUATIONS 

24. (Bar with heat generation) If heat is generated at a 
constant rate throughout a bar of length L = tt with 
initial temperature fix) and the ends at a = 0 and 
7 r are kept at temperature 0, the heat equation is 
u t = c 2 u xx + H with constant H > 0. Solve this 
problem. Hint. Set u = v - Hx(x — 7t)/(2c 2 ). 

25. (Convection) If heat in the bar in the text is free to 
flow through an end into the surrounding medium 
kept at 0°C, the PDE becomes v t = c 2 v xx - j3u. Show 
that it can be reduced to the form (1) by setting 
u(a, t) = m(a, /)w(/). 

26. Consider v t = c 2 v xx — v (0 < a < L, t > 0), 
u(0, t) = 0, i/(L, t) = 0, u(a, 0) = f{x), where the term 
—v models heat transfer to the surrounding medium 
kept at temperature 0. Reduce this PDE by setting 
u(a, 0 = u{ a, t)wit) with w such that u is given by (9), 
(10). 

27. (Nonhomogeneous heat equation) Show that the 
problem modeled by 

*t ~ c 2 “xx = Ne~ ax 

and (2), (3) can be reduced to a problem for the 
homogeneous heat equation by setting 

//(a, /) = i/(a, t) + wix) 

and determining w so that v satisfies the homogeneous 
PDE and the conditions u(0, t) = u(L, t) = 0, 
u(a, 0) = fix) - w(x), (The term Ne~ ax may represent 
heat loss due to radioactive decay in the bar.) 
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28-35 1 TWO-DIMENSIONAL PROBLEMS 

28. (Laplace equation) Find the potential in the rectangle 
0 = a' = 20, 0 = v ^ 40 whose upper side is kept at 
potential 220 V and whose other sides are grounded. 

29. Find die potential in the square 0 ^ a ^ 2, 0 ^ v = 2 
if the upper side is kept at the potential sin \ttx and the 
other sides are grounded. 

30. CAS PROJECT. Isotherms. Find the steady-state 
solutions (temperatures) in the square plate in Fig. 294 
with a = 2 satisfying the following boundary 
conditions. Graph isotherms. 

(a) u — sin ttx on the upper side, 0 on the others. 

(b) u = 0 on the vertical sides, assuming that the other 
sides are perfectly insulated. 

(c) Boundary conditions of your choice (such that the 
solution is not identically zero). 


a 


Fig. 294. Square plate 


31. (Heat flow in a plate) The faces of the thin square 
plate in Fig. 294 with side a = 24 are perfectly 
insulated. The upper side is kept at 20°C and the other 
sides are kept at 0°C. Find the steady-state temperature 
u( a\ y) in the plate. 

32. Find the steady-state temperature in the plate in Prob. 
3 1 if the lower side is kept at C/ 0 °C, the upper side at 
U i°C, and the other sides are kept at 0°C. Hint: Split 
into two problems in which the boundary temperature 
is 0 on three sides for each problem. 

33. (Mixed boundary value problem) Find the steady- 
state temperature in the plate in Prob. 31 with the upper 
and lower sides perfectly insulated, the left side kept 
at 0°C, and the right side kept at f(y)° C. 

34. (Radiation) Find steady-state temperatures in the 
rectangle in Fig. 293 with the upper and left sides 
perfectly insulated and the right side radiating into a 
medium at 0°C according to u x (a, y) 4* lnt(a, y) = 0, 
It > 0 constant. (You will get many solutions since no 
condition on the lower side is given.) 

35. Find formulas similar to (17), (18) for the temperature 
in the rectangle R of the text when the lower side of R 
is kept at temperature f(x) and the other sides are kept 
at 0°C. 


12.6 Heat Equation: Solution by 

Fourier Integrals and Transforms 

Our discussion of the heat equation 


( 1 ) 


du _ 2 
dt ~ ° dx 2 


in the last section extends to bars of infinite length, which are good models of very long 
bars or wires (such as a wire of length, say, 300 ft). Then the role of Fourier series in the 
solution process will be taken by Fourier integrals (Sec. 1 1.7). 

Let us illustrate the method by solving (1) for a bar that extends to infinity on both 
sides (and is laterally insulated as before). Then we do not have boundary conditions, but 
only the initial condition 

(2) u(x, 0) = f(x) (-00 < x < 00 ) 

where f(x) is the given initial temperature of the bar. 

To solve this problem, we start as in the last section, substituting u(x f t) = F{x)G(t) 
into (1). This gives the two ODEs 


( 3 ) 


F" + p 2 F = 0 


[see (5), Sec. 12.5] 




SEC 116 Heat Equation: Solution by Fourier Integrals and Transforms 563 

and 

(4) C + c 2 p 2 G = 0 [see (6), Sec. 12.5]. 

Solutions are 


Fix) = A cos px 4- B sin px and G(t) = e~ c2p2t , 

respectively, where A and B are any constants. Hence a solution of (1) is 

(5) u(x, t\ p) = FG = ( A cos px + B sin px)e~ 6 * p2t . 

Here we had to choose the separation constant k negative, k = —p 2 , because positive 
values of k would lead to an increasing exponential function in (5), which has no physical 
meaning. 

Use of Fourier Integrals 

Any series of functions (5), found in the usual manner by taking p as multiples of a fixed 
number, would lead to a function that is periodic in x when t = 0. However, since fix) 
in (2) is not assumed to be periodic, it is natural to use Fourier integrals instead of Fourier 
series. Also, A and B in (5) are arbitrary and we may regard them as functions of /?, writing 
A = A(p) and B = B(p). Now, since the heat equation (1) is linear and homogeneous, 
the function 

r°° 

t\ p) clp = I [A(p) cos px + B(p) sin px] e~° 2p2i dp 
J o 



is then a solution of (1), provided this integral exists and can be differentiated twice with 
respect to x and once with respect to /. 

Determination of A(p) and B(p) from the Initial Condition. From (6) and (2) we get 

(7) u(x, 0) = f [A(p) cos px + Bip) sin px] dp = /(*). 

-'o 


This gives A(p) and B(p) in terms of fix): indeed, from (4) in Sec. 11.7 we have 

1 r 50 1 r 00 

(8) Aip) = — fiu) cos pv dv , B{p) = — I /(u) sin pv dv. 

7T •'-co 7T -Loo 


According to (1*), Sec. 11.9, our Fourier integral (7) with these A(p) and Bip) can be 
written 


ui x t 0 ) = 


7 T 



fiv) cos ipx — pv) dv 




dp. 


Similarly, (6) in this section becomes 



564 


CHAP. 12 Partial Differential Equations (PDEs) 


EXAMPLE 1 


Assuming that we may reverse the order of integration, we obtain 

1 oc r oo ~ | 

(9) w(a, t) = — J f(v) I J e _c2p2£ cos ( px — pv) dp\ do. 


Then we can evaluate the inner integral by using the formula 

f 00 o2 V7 r , 2 

(10) j e ^ cos 2bs ds = — — e b . 


[A derivation of (10) is given in Problem Set 16.4 (Team Project 28).] This takes the form 
of our inner integral if we choose p = s!(c^/t) as a new variable of integration and set 


b = 


A* — V 

2cVt ’ 


Then lbs = (x — v)p and ds = cVr dp , so that (10) becomes 


f e ° Zp2t cos (/?a - pv) dp = exp 

2c V / 


(a - of 


4c 2 / 


By inserting this result into (9) we obtain the representation 

(a - 


4C 


£}*■ 


(11) u(x, t ) = J f(v) ex P 

Taking z = (v - a)/(2 cV/) as a variable of integration, we get the alternative form 

1 r x 

( 12 ) u( a , /) = f(x + 2czVt) e -* dz. 

V7 T J -oo 


If /(a) is bounded for all values of a and integrable in every finite interval, it can be 
shown (see Ref. [CIO]) that the function (11) or (12) satisfies (1) and (2). Hence this 
function is the required solution in the present case. 


Temperature in an Infinite Bar 


Find the temperature in the infinite bar if the initial temperature is (Fig. 295) 


/tv) = 


U 0 = const 
0 


if H < 1, 
if H > i. 



fix) 

1 


u* 



-] 

l 

1 * 


Fig. 295. Initial temperature in Example 1 
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EXAMPLE 2 


Solution. From ( 1 1 ) we have 

t!„ [■' ( . 

7rr 

If we introduce the above variable of integration z. then the integration over v from — 1 to 1 corresponds to the 
integration over z from (- 1 - .v)/(2 t*V7) to (1 - .x)/(2c V/), and 

n -x)i(2cVo 

(13) u(x, ,)= JljL I dz (/ > 0). 

V 7T •'—ci +x)/(2c\/iy 

We mention that this integral is not an elementary function, but can be expressed in terms of the error function, 
whose values have been tabulated. (Table A4 in App. 5 contains a few values; larger tables are listed in 
Ref. [GR1J in App. 1. See also CAS Project 10. p. 568.) Figure 296 shows u(x. t ) for U 0 = 100°C, 
c 2 = 1 cm 2 /sec, and several values of /. M 



Fig. 296. Solution u(x, t) in Example 1 for U 0 = 100°C, 
c 2 = 1 cm 2 /sec f and several values of t 

Use of Fourier Transforms 

The Fourier transform is closely related to the Fourier integral, from which we obtained 
the transform in Sec. 1 1 .9. And the transition to the Fourier cosine and sine transform in 
Sec. 11.8 was even simpler. (You may perhaps wish to review this before going on.) 
Hence it should not surprise you that we can use these transforms for solving our present 
or similar problems. The Fourier transform applies to problems concerning the entire axis, 
and the Fourier cosine and sine transforms to problems involving the positive half-axis. 
Let us explain these transform methods by typical applications that fit our present 
discussion. 

Temperature in the Infinite Bar in Example 1 

Solve Example I using ihe Fourier transform. 

Solution. The problem consists of the heat equation ( 1 ) and the initial condition (2), which in this example is 
fix) = U 0 = const if |.v| < 1 and 0 otherwise. 

Our strategy is to take the Fourier transform with respect to .r and then to solve the resulting ordinary DE in t. 
The details are as follows. 
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Let u = &{ii) denote the Fourier transform of u, regarded as a function ofx. From (10) in Sec. 1 1.9 we see 
that the heat equation ( I ) gives 

&(u t ) = c 2 &{u X3 J = c 2 (-m* 2 )SF(m) = -c 2 w 2 ii. 

On the left, assuming that we may interchange the order of differentiation and integration, we have 




!_ f 

V2tt J- 


u t e 


- nox dx = 


\Zl7T 


-00 

a, J_ y 


du 

~™ x dx = — . 

to 


Thus 


2 2 * 

— = — C W ll. 

dt 

Since this equation involves only a derivative with respect to t but none with respect to w, this is a first-order 
ordinary DE, with / as the independent variable and w as a parameter. By separating variables (Sec. 1.3) we 
get the general solution 

u(w. t ) = 

with the arbitrary “constant” C(w) depending on the parameter w. The initial condition (2) yields the relationship 
m(w, 0) = CM = fM = ^(/). Our intermediate result is 


u(w t t ) = f(\v)e 




The inversion formula (7), Sec. 1 1 .9, now gives the solution 

(14) u(Jf - ,)= vbi 

In this solution we may insert the Fourier transform 


hw)e~ M e iwx dw. 


1 f 


f{v)e~ 


} dv. 


Assuming that we may invert the order of integration, we then obtain 

"Cv, t) = ^ J f(v) [ f e-Mjiwx-wv) rf ,„j dv 

By the Euler formula (3). Sec. 1 1.9, the integrand of the inner integral equals 

e -cVt co$ ^ wx _ + j e ~c 2 w 2 t s j n ^ wx _ wv ) 

We see that its imaginary part is an odd function of w, so that its integral is 0. (More precisely, this is the 
principal part of the integral; see Sec. 16.4.) The real part is an even function of w, so that its integral from 
-oo to <« equals twice the integral from 0 to 


«(*, 0 = - f m \ 

77 J -oo L J 0 


cos ^ wx _ wv ^ 


■ dv. 


This agrees with (9) (with p = w ) and leads to the further formulas (1 1) and (13). 

Solution in Example 1 by the Method of Convolution 

Solve the heat problem in Example 1 by the method of convolution. 

Solution . The beginning is as in Example 2 and leads to (14), that is. 


(15) 


«(*,,)= <lw. 
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EXAMPLE 4 


Now comes the crucial idea. We recognize that this is of the form (13) in Sec. 1 i.9, that is* 
( 1 6) Hfc /) = (/ * g)(x) = j /(w)|( w)e Uox dw 


where 
(17) 

Since, by the definition of convolution [(II), Sec. 1 1.9], 


iM ~ ° Wt - 


(18) 


(/ * £>« 


-f. 


f(p)8(x - p) dp. 


as our next and last step we must determine the inverse Fourier transform g of g. For this we can use formula 
9 in Table m of Sec. 11.10, 

me-™?) = -4= <r w2/<4a> 

V2 a 

with a suitable a. With c 2 / = l/(4a) or a = l/(4c 2 r), using (17) we obtain 

g^-zWO) = = V2^ V2^i(w). 

Hence | has the inverse 


I 




-.r 2 /(4c 2 f) 


V2c 2 r V27T 

Replacing .v with .v — p and substituting this into (18) we finally have 


(19) 


I f ^ f (* ” P) 2 1 

,) = (/* *)(,) = j jfp) exp (- ~ 1?r } 


This solution formula of our problem agrees with (1 1). We wrote (/ * g)(.v), without indicating the parameter t 
with respect to which we did not integrate. H 

Fourier Sine Transform Applied to the Heat Equation 

If a laterally insulated bar extends from x = 0 to infinity, we can use the Fourier sine transform. We let the 
initial temperature be «(*, 0) = f{x) and impose the boundary condition «(0, t) = 0. Then from the heat equation 
and (9b) in Sec. 1 1.8, since /( 0) = w( 0, 0) = 0, we obtain 

= = c2 &s( u xx) = -c 2 w 2 & $ (u) = -c 2 w 2 u $ (w, /). 


This is a first-order ODE dujdt + c 2 w?u s = 0. Its solution is 


k$(h\ t) — C{w)e~ 




From the initial condition u{x, 0) = fix) we have u s (w, 0) = f s (\v) = C(w). Hence 

« s (w. t) = f s (w)e~ d ‘ u ’\ 

Taking the inverse Fourier sine transform and substituting 

IT r® 

= V "ir J 0 ^ S “ Wp dp 


/«(»•') 
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on the right, we obtain the solution formula 

n oc 

/(/;) sin wp e ~ c2w2t sin mv dp dw. 

j 

Figure 297 shows (20) with c — I for fix) = I if 0 ^ x ^ 1 and 0 otherwise, graphed over the .tf-plane for 
0 ^ 2. 0.01 ^ t ^ 1.5. Note that the curves of /i(.v, t) for constant r resemble those in Fig. 296 on p. 565. H 


H 



Fig. 297. Solution (20) in Example 4 




[~i— 7] SOLUTION IN INTEGRAL FORM 

Using (6), obtain the solution of (1) in integral form 
satisfying the initial condition u ( *, 0) = /(*), where 

1. f(x) = l if |.v| < a and 0 otherwise 

2. /(*) = e~ k W ( k > 0) 

3. fix) = 1/(1 + .v 2 ). [Use (15) in Sec. 11.7.] 

4. fix) = (sin *)/*. [Use Prob. 4 in Sec. 1 1.7J 

5. fix) = (sin ttx)/x. [Use Prob. 4 in Sec. 1 1.7.] 

6. fix) = x if |*| < l and 0 otherwise 

7. /(*) = |*| if |*| < 1 and 0 otherwise. 

8. Verify that u in Prob. 5 satisfies the initial condition. 

9. CAS PROJECT. Heat Flow, (a) Graph the basic 
Fig. 296. 

(b) In (a) apply animation to “see” the heat flow in 
terms of the decrease of temperature. 

(c) Graph u(x, t ) with c = I as a surface over the upper 
*/-half-plane. 

10. CAS PROJECT. Error Function 

(21) erf .v = f e~ w * dw 

W 0 


This function is important in applied mathematics 
and physics (probability theory and statistics, 
thermodynamics, etc.) and fits our present discussion. 
Regarding it as a typical case of a special function 
defined by an integral that cannot be evaluated as in 
elementary calculus, do the following. 

(a) Sketch or graph the bell-shaped curve [the curve 
of the integrand in (2I)J. Show that erf* is odd. Show 
that 

J e~ u,Z dw = ^ 

fk 

I e~ w 2 dw = V7rerfZ>. 

J -b 

(b) Obtain the Maclaurin series of erf* from that 
of the integrand. Use that series to compute a table of 
erf* for * = 0(0.01)3 (meaning * = 0, 0.01, 0.02, 
• • • , 3). 

(c) Obtain the values required in (b) by an integration 
command of your CAS. Compare accuracy. 

(d) It can be shown that erf (so) = 1 . Confirm this 
experimentally by computing erf* for large *. 
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(e) Let fix) = 1 when a* > 0 and 0 when a < 0. Using 
erf (<*) = L show that (12) then gives 



(f) Express the temperature (13) in terms of the error 
function. 



Here, the integral is the definition of the “distribution 
function of the normal distribution” to be discussed in 
Sec. 24.8. 


12 ." Modeling: Membrane, 

Two-Dimensional Wave Equation 

The vibrating string in Sec. 12.2 is a basic one-dimensional vibrational problem. Equally 
important is its two-dimensional analog, namely, the motion of an elastic membrane, such 
as a drumhead, that is stretched and then fixed along its edge. Indeed, setting up the model 
will proceed almost as in Sec. 12.2. 

Physical Assumptions 

1. The mass of the membrane per unit area is constant (“homogeneous membrane”). 
The membrane is perfectly flexible and offers no resistance to bending. 

2. The membrane is stretched and then fixed along its entire boundary in the Ay-plane. 
The tension per unit length T caused by stretching the membrane is the same at all 
points and in all directions and does not change during the motion. 

3. The deflection u{ a, y, t) of the membrane during the motion is small compared to 
the size of the membrane, and all angles of inclination are small. 

Although these assumptions cannot be realized exactly, they hold relatively accurately for 
small transverse vibrations of a thin elastic membrane, so that we shall obtain a good 
model, for instance, of a drumhead. 

Derivation of the PDE of the Model (“Two-Dimensional Wave Equation”) from 
Forces. As in Sec. 12.2 the model will consist of a PDE and additional conditions. The 
PDE will be obtained by the same method as in Sec. 12.2, namely, by considering the 
forces acting on a small portion of the physical system, the membrane in Fig. 298 on the 
next page, as it is moving up and down. 

Since the deflections of the membrane and the angles of inclination are small, the sides 
of the portion are approximately equal to A a and Ay. The tension T is the force per unit 
length. Hence the forces acting on the sides of the portion are approximately T Aa and 
T Ay. Since the membrane is perfectly flexible, these forces are tangent to the moving 
membrane at every instant. 

Horizontal Components of the Forces. We first consider the horizontal components 
of the forces. These components are obtained by multiplying the forces by the cosines of 
the angles of inclination. Since these angles are small, their cosines are close to 1. Hence 
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Fig. 298. Vibrating membrane 


the horizontal components of the forces at opposite sides are approximately equal. 
Therefore, the motion of the particles of the membrane in a horizontal direction will be 
negligibly small. From this we conclude that we may regard the motion of the membrane 
as transversal; that is, each particle moves vertically. 

Vertical Components of the Forces. These components along the right side and the 
left side are (Fig. 298), respectively, 

T A y sin f3 and — T Ay sin a. 

Here a and /3 are the values of the angle of inclination (which varies slightly along the 
edges) in the middle of the edges, and the minus sign appears because the force on the 
left side is directed downward. Since the angles are small, we may replace their sines by 
their tangents. Hence the resultant of those two vertical components is 

T Ay (sin f3 — sin a) T Ay (tan /3 — tan a) 

= T A y [u x (x + Ax, y t ) - u x (x, y 2 )] 

where subscripts * denote partial derivatives and y 1 and y 2 are values between y and 
y 4- Ay. Similarly, the resultant of the vertical components of the forces acting on the 
other two sides of the portion is 

(2) T Ax [«„(*!, y + Ay) - u y (x 2 , >>)] 

where jc, and x 2 are values between x and x + Ax. 

Newton’s Second Law Gives the PDE of the Model. By Newton’s second law (see 
Sec. 2.4) the sum of the forces given by (1) and (2) is equal to the mass pAA of that small 
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portion times the acceleration d 2 u/dt 2 ; here p is the mass of the undeflected membrane 
per unit area, and A A = Ax Ay is the area of that portion when it is undeflected. Thus 

d 2 u 

P A.v Ay = T Ay [ujx + Ax, y x ) - u x (x, y 2 )] 

+ T Ax [uyCxj, y + Ay) - u y (x 2 , y)] 

where the derivative on the left is evaluated at some suitable point (x, y) corresponding 
to that portion. Division by p Ax Ay gives 


d 2 u _ T- u x Q 
dt 2 p 


x + Ax. yi) - u x (x, y 2 ) Uyfa, y + Ay) - u y (x 2 , y) 
A.v Ay 


If we let Ax and Ay approach zero, we obtain the PDE of the model 
(3) 


d 2 lt 2 / d 2 u d 2 U \ 

IF ~ c iiF + 1FJ 


T 

P ' 


This PDE is called the two-dimensional wave equation. The expression in parentheses 
is the Laplacian V 2 « of u (Sec. 10.8). Hence (3) can be written 


(3') 


d z u 

dt 2 


= c 2 V 2 u. 


Solutions of the wave equation (3) will be obtained and discussed in the next section. 


12.8 Rectangular Membrane. 

Double Fourier Series 

The model of the vibrating membrane for obtaining the displacement u(x, y, t ) of a point 
( x , y) of the membrane from rest (« = 0) at time t is 


(1) 

d 2 u 2 / r) 2 w d 2 u 

dt 2 C \ dx 2 + dy 2 

(2) 

u = 0 on the boundary 

(3a) 

u(x, y, 0) = f{x, y) 

(3b) 

u t (x, y, 0) = g(x, y). 


Here (1) Ls the two-dimensional wave equation with c 2 = Tip just derived, (2) is the 
boundary condition (membrane fixed along the boundary in the jry-plane for all times 
t § 0), and (3) are the initial conditions at t = 0, consisting of the given initial 
displacement (initial shape) /(. x, y) and the given initial velocity g{x, y), where u t = dti/dt. 
We see that these conditions are quite similar to those for the string in Sec. 12.2. 
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.v 

b 

R 

a x 

Fig. 299. Rectangular membrane 

As a first important model, let us consider the rectangular membrane R in Fig. 299, 
which is simpler than the circular drumhead to follow. Then the boundary in (2) is the 
rectangle in Fig. 299. We shall solve this problem in three steps: 

Step 1 . By separating variables, setting u(x, y, /) = F(x, y)G(t) and later F( x, y) = H(x)Q(y) 
we obtain from (1) an ODE (4) for G and later from a PDE (5) for F two DDEs (6) and 
(7) for H and Q. 

Step 2 . From the solutions of those ODEs we determine solutions (13) of (1) 
(“eigenfunctions” that satisfy the boundary condition (2). 

Step 3. We compose the u mn into a double series (14) solving the whole model (1), (2), (3). 

Step 1. Three ODEs From the Wave Equation (1) 

To obtain ODEs from (1), we apply two successive separations of variables. In the first 
separation we set w(.v, y, 0 = Fix, y)G(f). Substitution into (1) gives 

FG = c 2 (F„G + FyyG) 

where subscripts denote partial derivatives and dots denote derivatives with respect to t. 
To separate the variables, we divide both sides by c 2 FG: 

G l 

7c = 7 (F ~ + 

Since the left side depends only on t, whereas the right side is independent of t , both sides 
must equal a constant. By a simple investigation we see that only negative values of that 
constant will lead to solutions that satisfy (2) without being identically zero; this is similar 
to Sec. 12.3. Denoting that negative constant by - v 2 , we have 

5 1 2 
~#q = J ( F «c + F w) = ~ V ■ 

This gives two equations: for the “time function” G(r) we have the ODE 

(4) G + A 2 G = 0 where A = cv , 

and for the “amplitude function” F(x, y) a PDE, called the two-dimensional Helmholtz 3 
equation 

(5) F xx + Fyy + v 2 F = 0. 


3 HERMANN VON HELMHOLTZ (1821-1894), German physicist, known for his basic work in 
thermodynamics, fluid flow, and acoustics. 
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Separation of the Helmholtz equation is achieved if we set F(a\ y) = H(x)Q(y). By 
substitution of this into (5) we obtain 


d 2 H 

clx 2 


Q 


-( w 0 + H- 


To separate the variables, we divide both sides by HQ, finding 


I d 2 H 
H lx 2 


Q 


(f+4 


Both sides must equal a constant, by the usual argument. This constant must be negative, 
say, — k 2 > because only negative values will lead to solutions that satisfy (2) without being 
identically zero. Thus 


1 d z H 
H dx 2 


Q 


(fQ 
\ dy 2 



-k 2 . 


This yields two ODEs for H and Q, namely, 


( 6 ) 


d 2 H 

dx 2 


+ k 2 H 


0 


and 

d 2 0 

(7) ~dy^~ + ~ ® where p 2 = v 2 - k 2 . 


Step 2. Satisfying the Boundary Condition 

General solutions of (6) and (7) are 

H(x) = A cos kx + B sin kx and Q(y) = C cos py H- D sin py 

with constant A, B, C, D. From u = FG and (2) it follows that F = HQ must be zero on 
the boundary, that is, on the edges x = 0, x = a, y = 0, y = b; see Fig. 299. This gives 
the conditions 

H{ 0) = 0, H(a) = 0, £2(0) = 0, Q(b) = 0. 

Hence //( 0) = A = 0 and then H(d) = B sin ka = 0. Here we must take B ¥= 0 since 
otherwise H(x) = 0 and F(x, y) = 0. Hence sin ka = 0 or ka = imr, that is, 

jjitt 

k = (m integer). 
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In precisely the same fashion we conclude that C = 0 and p must be restricted to the 
values p = rnrlb where n is an integer. We thus obtain the solutions H = H mt Q = Q nf 
where 


H m (x) = sin 


ITITTX 

a 


and 


Q n (y) = sin 


nuy 

~T' 


m = 1, 2, • • • , 
n = 1, 2, • • • . 


As in the case of the vibrating string, it is not necessary to consider m>n = — 1, —2, • • • 
since the corresponding solutions are essentially the same as for positive m and n y except 
for a factor — 1 . Hence the functions 


ftlTTX MTV m “ 1 » 2 , * ‘ , 

(8) F mn (x , y) = H m (x)Q n (y) = sin sin — , 

a b n = 1 , 2, • • • , 

are solutions of the Helmholtz equation (5) that are zero on the boundary of our membrane. 

Eigenfunctions and Eigenvalues. Having taken care of (5), we turn to (4). Since 
p 2 = v 2 - k 2 in (7) and A = cv in (4), we have 


A = cVk 2 + p 2 . 


Hence to k = mu fa and p = mr/b there corresponds the value 


( 9 ) 


fm 2 n 2 

h Kiln C7r y a 2 + ^2 • 


in the ODE (4). A corresponding general solution of (4) is 


^mn(0 F mn cos A mn t 4- ^ mn sin A mn r. 


m = 1, 2, • • • , 

n = 1, 2, • • • , 


It follows that the functions ii mn ( a\ y, 0 = F mn (jt% y)G 7nn (t), written out 


( 10 ) 


with A wn according to (9), are solutions of the wave equation (1) that are zero on 
the boundary of the rectangular membrane in Fig. 299. These functions are called the 
eigenfunctions or characteristic functions, and the numbers A mn are called the 
eigenvalues or characteristic values of the vibrating membrane. The frequency of u mn is 

Discussion of Eigenfunctions. It is very interesting that, depending on a and b, several 
functions F mn may correspond to the same eigenvalue. Physically this means that there 
may exist vibrations having the same frequency but entirely different nodal lines (curves 
of points on the membrane that do not move). Let us illustrate this with the following 
example. 
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EXAMPLE 1 


Eigenvalues and Eigenfunctions of the Square Membrane 

Consider the square membrane with a = b — 1. From (9) we obtain its eigenvalues 

(11) \ mn = cttV m 2 4- n 2 . 

Hence A mn = but for m 4* n the corresponding functions 

F mn = sin ntTrx sin mry and F nm = sin uttx sin nnry 

are certainly different. For example, to A 12 = A 2 i = ctt\/5 there correspond the two functions 
F 12 = sin ttx sin l-rry and F 21 = sin 2ttx sin Try. 

Hence the corresponding solutions 

// 12 = (#12 cos crrS/St + B \ 2 sin c 7 tV 5/)F 12 and « 2 i = (#21 cos CTfV5t + £ 21 sin c 7 tV 5;)F 21 

have the nodal lines y = 5 and x = respectively (see Fig. 300). Taking B 12 - i and B* 2 = #21 = 0. we 
obtain 

(12 ) h i2 + 1/21 = cos crrV5t (F 12 + B 2 iF 21 ) 

which represents another vibration corresponding to the eigenvalue ctt\/ 5. The nodal line of this function is the 
solution of the equation 


F 12 + B 2 iF 2 i = sin ttx sin 2 iry + B 2l sin 2 t 7 .y sin Try = 0 
or, since sin 2a = 2 sin a cos a, 

(13) sin ttx sin Try (cos Try + B 2l cos ttjt) = 0. 

This solution depends on the value of Z? 21 (see Fig. 301). 

From ( 1 1 ) we see that even more than two functions may correspond to the same numerical value of A 7?m . 
For example, the four functions F 18 . F 81 . F 47 . and F 74 correspond to the value 

A18 = A 81 = A 47 = A-74 = c*7tV 65, because 1 1 + 8 2 = 4 2 + 7 2 = 65. 

This happens because 65 can be expressed as the sum of two squares of positive integers in several ways. 
According to a theorem by Gauss, this is the case for every sum of two squares among whose prime factors 
there are at least two different ones of the form 4 n + 1 where n is a positive integer. In our case we have 
65 = 5- 13 = (4 + I )(1 2 + I). ■ 


r n 

W 12 W 21 


*22 «13 W 31 

Fig. 300. Nodal lines of the solutions 
U]v t/31 in the case of 

the square membrane 



Fig. 301. Nodal lines 
of the solution (12) for 
some values of B 21 
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Step 3. Solution of the Model (1), (2), (3). 

Double Fourier Series 

So far we have solutions (10) satisfying (1) and (2) only. To obtain the solution that also 
satisfies (3), we proceed as in Sec. 12.3. We consider the double series 

oc oc 

d(x t 0 0 

m=ln=l 

(14) 

* . mm . wry 

~~ ^ ^ ( fimn cos Knnt ® rnw ^mnO 7 

m=ln=l * * 

(without discussing convergence and uniqueness). From (14) and (3a), setting / = 0, we 
have 


(15) 


u(x, y, 0) = 2 2 


fTlTTX . /27Ty 


m=ln=l 


«rn sm sin — — = /(jc, y). 


Suppose that f(x , >•) can be represented by (15). (Sufficient for this is the continuity of 
/, df/dx, df/dy, d 2 f/dxdy in R.) Then (15) is called the double Fourier series of /(a, y). 
Its coefficients can be determined as follows. Setting 


(16) 


^mOO 2) Bmn ^hl 


nrry 


n= 1 


we can write (15) in the form 


Six, y) = 2 K m {y) sin 


mm 


m= 1 


For fixed y this is the Fourier sine series of /(a, y), considered as a function of x. From 
(4) in Sec. 1 1.3 we see that the coefficients of this expansion are 


(17) 


2 r mm 

K m (y) = - /(a*, y) sin dx. 

Cl J 0 Cl 


Furthermore, (16) is the Fourier sine series of K m (y\ and from (4) in Sec. 11.3 it follows 
that the coefficients are 


2 r , rnry 

Bmn “ K m (y) sin — — dy. 

From this and (17) we obtain the generalized Euler formula 


J> r a 


( 18 ) 


4 r r mm wry m 1*2, 

Bmn = ~T /(*’ y) sin sin T dx dy 

ab J 0 J o a b n = 1 , 2 , 
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for the Fourier coefficients of f(x, y) in the double Fourier series (15). 

The B mn in (14) are now determined in terms of /(x, y). To determine the B^ ny we 
differentiate (14) termwise with respect to t; using (3b), we obtain 


da 

~dt 


n? ^ , . mm . niry 

= 2j 2j BmnKvn sin sm = g( x, y). 

t*0 m=ln=l a 0 


Suppose that g(x ; y) can be developed in this double Fourier series. Then, proceeding as 
before, we find that the coefficients are 


(19) 


D* _ 

u mn 


abKi n 


fV , x • . ntry J , 

I g( x ’ y ) sin sm — ; — dx dy 

J 0 J 0 Cl 0 


m = 1, 2, • • • 

K = 1, 2, • • • . 


Result Iff and g in (3) are such that u can be represented by (14), then (14) with 
coefficients (18) and (19) is the solution of the model (1), (2), (3). 

Vibration of a Rectangular Membrane 

Find the vibrations of a rectangular membrane of sides a - 4 ft and b = 2 ft (Fig. 302) if the tension is 
12.5 Ih/ffc, the density is 2.5 slugs/ft 2 (as for light rubber), the initial velocity is 0, and the initial displacement is 

(20) /(*, .v) = 0.1(4* - x\ly - y 2 ) Ft. 


y 


2 ■ 

L 


R 


4 


x 



4 x 


Membrane 


Initial displacement 

Fig. 302. Example 2 


Solution, c 2 = Tip = 12.5/2.5 = 5 [ft 2 /sec 2 ]. Also, B* „ = 0 from (19). From (18) and (20), 


2 4 

4 f f <y 9 mux niry 

Bmn = J o J o ai < 4A ' “ x2 )(2y - 3 ?2 ) Sin -j- sin — dx dy 

1 f 0 ni'irx f « wry 

= 20 J ~ X * Si " ~4~ dX J (2y " y ) sin 2 dy ' 


Two integrations by parts give for the first integral on the right 


and for the second integral 


128 r 256 

3 3 t* ( D ] - 33 

m tt m it 


16 r 32 

- 3-3 [1 - (-l) n ] = 


(m odd) 


(n odd). 


trir " ■ 

For even m or n we get 0. Together with the factor 1/20 we thus have B mn = 0 if m or n is even and 

256 • 32 0.426 050 


Bmn ~ 20mW ~ mV 


(m and n both odd). 
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From this, (9), and (14) we obtain the answer 


^ x-, 1 / VStt a r-z z\ rrnrx mry 

w(x, y, t) - 0.426 050 2 2 ' 3 3 cos I — - — V m 2 + 4 n 2 1 1 sin — — sin — — 

nun odd m n \ 4 * 4 2 


(21) = 0.426 050 cos 


VSWs ttx . iry I V5W 37 7r.v p 3 Try 


• t sin — sin -f- + — cos 
4 2 27 


- i sin — sin — — 


1 V5ttVT 3 3«e 


+ — cos 
27 


- 1 sin — — sin — + cos 
4 2 729 


1 V577V45 37DC 377y 

— cos 1 sin — — sin — — 

29 4 4 2 


To discuss this solution, we note that the first term is very similar to the initial shape of the membrane, has no 
nodal lines, and is by far the dominating term because the coefficients of the next terms are much smaller. The 
second term has two horizontal nodal lines (y = 2/3, 4/3), the third term two vertical ones ( x = 4/3, 8/3), the 
fourth term two horizontal and two vertical ones, and so on. H 


* M SE T 1 2 . a 


1. (Frequency) How does the frequency of the 
eigenfunctions of the rectangular membrane change if 

(a) we double the tension, (b) we take a membrane of 
half the mass of the original one, (c) we double the 
sides of the membrane? (Give reason.) 

SQUARE MEMBRANE 

2. Determine and sketch the nodal lines of the 
eigenfunctions of the square membrane for m — 1 , 2 , 
3, 4 and n— 1,2, 3, 4. 

3-8 Double Fourier Series. Represent /(x, y) by a 

series (15), where 0 < x < l,0<y< 1. 

3. f(x t y) = 1 

4. /(x, y) = x 

5* f(x , y) = y 

6 . /(x, y) = x + y 

7. f{x y y) = xy 

8 . f(x , y) = xy(l - x)(l - y) 



Fig. 303. Partial sums S 2 2 and S 10 T0 
in CAS Project 9b 


10. CAS EXPERIMENT. Quadruples of F mn . Write a 
program that gives you four numerically equal A mn in 
Example 1, so that four different F mn correspond to 
it. Sketch the nodal lines of F 18 , F siy F 47 , F 74 in 
Example 1 and similarly for further F mn that you will 
find. 


9. CAS PROJECT. Double Fourier Series, (a) Write a 
program that gives and graphs partial sums of (15). 
Apply it to Probs. 4 and 5. Do the graphs show that 
those partial sums satisfy the boundary condition (3a)? 
Explain why. Why is the convergence rapid? 

(b) Do the tasks in (a) for Prob. 3. Graph a portion, 
say, 0 <jf<i 0 <y<i of several partial sums on 
common axes, so that you can see how they differ. (See 
Fig. 303.) 

(c) Do the tasks in (b) for functions of your choice. 


11-13 Deflection. Find the deflection n(x, y, /) of the 
square membrane of side 7 rand c 2 = 1 if the initial velocity 
is 0 and the initial deflection is 

11. k sin 2x sin 5y 

12. 0.1 sin x sin y 

13. 0 . 1 xy( 7 T — x)( 7 T — y ) 

RECTANGULAR MEMBRANE 

14. Verify the discussion of the terms of (21) in Example 2. 

15. Repeat the task of Prob. 2 when a = 4 and b = 1 . 
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16. Verify the calculation of B mn in Example 2 by 
integration by parts. 

17. Find eigenvalues of the rectangular membrane of sides 
a = 2 and b = 1 to which there correspond two or 
more different (independent) eigenfunctions. 

18. (Minimum property) Show that among all rectangular 
membranes of the same area A — ab and the same c 
the square membrane is that for which u n [see (10)] 
has the lowest frequency. 

[ 1 9-22 1 Double Fourier Series. Represent f(x , y) 

(0 < x < a, 0 < y < b) by a double Fourier series (15). 

19. /( x $ y) = k 

20 . /(*, y) = 0 . 25 * 3 ’ 


21 . /(*. y) = xy(a 2 - x 2 ){b 2 - y 2 ) 

22 . f(x, y) = xy(a - x)(b - y) 

23 . (Deflection) Find the deflection of the membrane of 
sides a and b with c 2 = 1 for the initial deflection 

^ 37 tx 4 Try . . . 

f(x, y) = sin sin — — and initial velocity 0. 

a b 

24 . Repeat the task in Prob. 23 with c 2 = 1, for /(*, y) as 
in Prob. 22 and initial velocity 0. 

25 . (Forced vibrations) Show that forced vibrations of a 
membrane are modeled by the PDE u tt = c 2 V 2 w -1- P/p, 
where P(x, y, t ) is the external force per unit area acting 
perpendicular to the *y-plane. 


12.$ Laplacian in Polar Coordinates. 

Circular Membrane. 

Fourier-Bessel Series 

In boundary value problems for PDEs it is a general principle to use coordinates in which 
the formula for the boundary is as simple as possible. Since we want to discuss circular 
membranes (drumheads), we first transform the Laplacian in the wave equation (1), 
Sec. 12.8, 

(1) iht = c 2 V 2 u = c 2 ^ + Uyy) 

(subscripts denoting partial derivatives) into polar coordinates 

r = V* 2 4- y 2 , 0 = arctan . 


Hence x = r cos 0, y = r sin 0. By the chain rule (Sec. 9.6) we obtain 

u x = + u o 0 x . 

Differentiating once more with respect to x and using the product rule and then again the 
chain rule gives 

Uxx = ( u r r x)x + («A)x 

( 2 ) = (Ur) x r x + + (u 0 ) x e x + UqBxx 

( Mrr r x u r0@x)t'x u r f xx ( u 0r r x u 00^x)^x u d@xx' 

Also, by differentiation of r and 0 we find 

= £ . £ e - 1 ( 3- \ _ y 

* VJTf r’ l + (y/xf \ *) 
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y 


R x 


Fig. 304. Circular 
membrane 


Differentiating these two formulas again, we obtain 


r ~ xr x 1 * 2 y 2 /_ _2_\ 2 xy 

Ixx r 2 r r 3 f Z ’ °xx y I r 3 I r * ; .4 • 

We substitute all these expressions into (2). Assuming continuity of the first and second 
partial derivatives, we have u r0 = u er , and by simplifying. 


(3) 


_ 0 r , r , 

^2 ^ ^.3 w r0 ' ^4 ' 3 W r "r Z ^ U #. 


In a similar fashion it follows that 

„2 


(4) 


y“ xy x 2 x 2 xy 

u yy ~~ ~^2 4" ^ ^3 4* w## + “^3* u r ~~ 2 u 0 . 


By adding (3) and (4) we see that the Laplacian of u in polar coordinates is 

„ 9 d 2 u 1 du 1 d 2 u 

(5) V 2 W = 9 -i b — 2 • 

dr 2 r dr r 2 d0 2 


Circular Membrane 

Circular membranes occur in drums, pumps, microphones, telephones, and so on. This 
accounts for their great importance in engineering. Whenever a circular membrane is plane 
and its material is elastic, but offers no resistance to bending (this excludes thin metallic 
membranes!), its vibrations are modeled by the two-dimensional wave equation in polar 
coordinates obtained from (1) with V 2 n given by (5), that is, 


( 6 ) 


— -C 2 I 

( d 2 u 1 

du 1 

, i 

d 2 u 

dt 2 ' 

(dr 2 + r 

dr r 2 

d8 2 


c 2 = 


T 

P ' 


We shall consider a membrane of radius R (Fig. 304) and determine solutions u(r, t ) 
that are radially symmetric. (Solutions also depending on the angle 8 will be discussed in 
the problem set.) Then u 00 = 0 in (6) and the model of the problem (the analog of (1), 
(2), (3) in Sec. 12.8) is 


(7) 

C ) 2 u 2 j 

(d 2 u l 

du 


1 >- 
+ 

Icsi 

dr 

(8) 

u(R, t) = 

0 for all t ^ 0 

(9a) 

«(/-, 

' ✓ 

II 

o 


(9b) 

u t (r, 

0) = g(r). 



Here (8) means that the membrane is fixed along the boundary circle r = R. The initial 
deflection /(/•) and the initial velocity g{r) depend only on r, not on 8, so that we can 
expect radially symmetric solutions n(r, t). 
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Step 1. Two ODEs From the Wave Equation (7). 

Bessel’s Equation 

Using the method of separation of variables, we first determine solutions u(r , t) = W(r) G(/). 
(We write W , not F because W depends on r, whereas F, used before, depended on x.) 
Substituting u = WG and its derivatives into (7) and dividing the result by c 2 WG, we get 



where dots denote derivatives with respect to t and primes denote derivatives with respect 
to r. The expressions on both sides must equal a constant. This constant must be negative, 
say, — k 2 > in order to obtain solutions that satisfy the boundary condition without being 
identically zero. Thus, 


c 2 G 


- -57 («" + 7 w*) = -* a 


This gives the two linear ODEs 

( 10 ) 

and 


G + A 2 G = 0 


where A = ck 


( 11 ) 


W" + - W' +k z W= 0. 

r 


We can reduce (1 1) to Bessel’s equation (Sec. 5.5) if we set s = hr. Then 1/r = k/s and, 
retaining the notation W for simplicity, we obtain by the chain rule 


w , _ dW dW ds dW 

dr ds dr ds 


and W = 


d 2 W 


ds 2 




By substituting this into (11) and omitting the common factor k 2 we have 

( 12 ) 


d 2 W 1 dW 

+ — + W = 0 . 


ds 2 s ds 


This is Bessel’s equation (1), Sec. 5.5, with parameter v — 0. 


Step 2. Satisfying the Boundary Condition (8) 

Solutions of (12) are the Bessel functions J 0 and Y 0 of the first and second kind (see 
Secs. 5.5, 5.6). But Y 0 becomes infinite at 0, so that we cannot use it because the deflection 
of the membrane must always remain finite. This leaves us with 


(13) 


W(r) = J 0 (s ) = J 0 (kr) 


(s = kr). 
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On the boundary r = R we get W(R) = J 0 (kR) = 0 from (8) (because G = 0 would imply 
u = 0). We can satisfy this condition because J 0 has (infinitely many) positive zeros, 
s = a ly a 2 , * • • (see Fig. 305), with numerical values 

ai = 2.4048, a 2 = 5.5201, a 3 = 8.6537, or 4 = 11.7915, cr 5 = 14.9309 

and so on. (For further values, consult your CAS or Ref. [GR1] in App. I.) These zeros 
are slightly irregularly spaced, as we see. Equation (13) now implies 

(14) kR = at,,, thus k — kfn = — ~ , m — 1, 2, • • • . 

A 

Hence the functions 

(15) W m {r) = J 0 (k m r) = J 0 ^ rj , m = 1, 2, • • • 

are solutions of (1 1) that are zero on the boundary circle r = R . 

Eigenfunctions and Eigenvalues. For W vl in (15), a corresponding general solution of 
(10) with A = A m . = ck m = ca^JR is 

G m (t) = A m cos A m t + B m sin Kj. 

Hence the functions 

(16) u m (i\ t) = W m (r)G m (t) = (A m cos \, n t + B m sin A m f)/ 0 (*»»»‘) 

with m = 1 , 2, • ■ • are solutions of the wave equation (7) satisfying the boundary condition 
(8). These are the eigenfunctions of our problem. The corresponding eigenvalues are A m . 

The vibration of the membrane corresponding to u m is called the mth normal mode; 
it has the frequency A in l2ir cycles per unit time. Since the zeros of the Bessel function J 0 
are not regularly spaced on the axis (in contrast to the zeros of the sine functions appearing 
in the case of the vibrating string), the sound of a drum is entirely different from that of 
a violin. The forms of the normal modes can easily be obtained from Fig. 305 and are 
shown in Fig. 306. For m = 1 , all the points of the membrane move up (or down) at the 
same time. For m = 2, the situation is as follows. The function W 2 (r) = J 0 ( a 2 r!R ) is zero 
for a 2 r/R = a l9 thus r = a r R/a 2 . The circle r = is, therefore, nodal line, and 

when at some instant the central part of the membrane moves up, the outer part 
(/• > c*i Rla 2 ) moves down, and conversely. The solution w m (r, t) has m — 1 nodal lines, 
which are circles (Fig. 306). 




J 0 (s) 





-10 . 
^ 1 xT 

~s? 

L- 

/ 

/ 

\ 

5 

1 

10 





V- 

^“2 

S “4 

s 


Fig. 305. Bessel function J 0 (s) 
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m - 1 


m - 2 


m = 3 


Fig. 306. Normal modes of the circular membrane in the case of vibrations 
independent of the angle 


Step 3. Solution of the Entire Problem 

To obtain a solution u(r, t) that also satisfies the initial conditions (9), we may proceed 
as in the case of the string. That is, we consider the series 


(17) u(r, 0-2 W«(r)G m (0 = 2 (Am cos A m t + B m sin A m /) J 0 ( r) 

w=l m=l ' ' 

(leaving aside the problems of convergence and uniqueness). Setting t = 0 and using (9a), 
we obtain 


(18) u(r, 0) = 2 AmjJ^T r) = fid- 

Thus for the series (17) to satisfy the condition (9a), the constants A m must be the 
coefficients of the Fourier-Bessel series (18) that represents /(/*) in terms of J 0 (a m r/R); 
that is [see (10) in Sec. 5.8 with n = 0, ar 0jm = and a* = r], 

(19) = 

Differentiability of /(/*) in the interval 0 ^ r ^ R is sufficient for the existence of the 
development (18); see Ref. [A13]. The coefficients B m in (17) can be determined from 
(9b) in a similar fashion. Numeric values of A m and B m may be obtained from a CAS or 
by a numeric integration method, using tables of J 0 and J v However, numeric integration 
can sometimes be avoided \ as the following example shows. 
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EXAMPLE 1 


Vibrations of a Circular Membrane 

Find die vibrations of a circular drumhead of radius 1 l*t and density 2 slugs/ft 2 if the tension is 8 lb/ft, the initial 
velocity is 0, and the initial displacement is 

f(r) = 1 - r 2 [ft]. 

Solution, c 2 = Tip =8/2 = 4 [ft 2 /$ec 2 ]. Also B m = 0, since the initial velocity is 0. From ( 1 9) and Example 
3 in Sec. 5.8, since R = 1, we obtain 

2 f 1 2 

Ki = —£ — T r(l - r ) ./o(a, tt r) dr 

J 1 (* 711 ) •'o 
= 4J 2 (a,, t ) 

«m 2 A 2 («m) 

_ 8 
“m 3 A(“m) 

where the last equality follows from (24c). Sec. 5.5, with v = I, that is, 

2 2 

= A(<W* 

Table 9.5 on p. 409 of [GR I ] gives OL, n and Jo(c^n). From this we get .^(cqj = — b y (24b), Sec. 5.5, 
widi v — 0, and compute the coefficients A m : 


m 


j l(ow) 



1 

2.40483 

0.51915 

0.43176 

1.10801 

2 

5.52008 

-0.34026 

-0.12328 

-0.13978 

3 

8.65373 

0.27145 

0.06274 

0.04548 

4 

11.79153 

-0.23246 

-0.03943 

-0.02099 

5 

14.93092 

0.20655 

0.02767 

0.01164 

6 

18.07106 

-0.18773 

-0.02078 

-0.00722 

7 

21.21164 

0.17327 

0.01634 

0.00484 

8 

24.35247 

-0.16170 

-0.01328 

-0.00343 

9 

27.49348 

0.15218 

0.01107 

0.00253 

10 

30.63461 

-0.14417 

-0.00941 

-0.00193 


Thus 

/(r) = 1.108y o (2.4048r) - 0.140/ o (5.52Olr) + 0.0457 o (8.6537r) . 

We see that die coefficients decrease relatively slowly. The sum of the explicitly given coefficients in the table 
is 0.99915. The sum of all the coefficients should be 1. (Why?) Hence by die Leibniz test in App. A3.3 the 
partial sum of those terms gives about three correct decimals of the amplitude f(r). 

Since 

A m = ck m — ca m !R = 2cq n , 


from (17) we thus obtain the solution (with r measured in feet and r in seconds) 

u(r, t ) = l.l087 o (2.4048r) cos 4.8097/- 0.1 407 0 (5. 5201 r) cos 11.0402r + 0.045 7 0 (8.6537r) cos 17.3075/ . 

In Fig. 306, m = l gives an idea of the motion of the first term of our series, m = 2 of the second term, and 
m — 3 of the third term, so that we can “see” our result about as well as for a violin string in Sec. 12.3. H 
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! PROBLEM SET 12 


1. Why did we use polar coordinates in this section? 

2. Work out the details of the calculation leading to the 
Laplacian in polar coordinates. 

3. if u is independent of 0, then (5) reduces to 
V 2 // = u rr + ityfr. Derive this directly from the 
Laplacian in Cartesian coordinates. 

* 1 fl / du \ 

4. An alternative form of (5) is \ u = — r — 

l d 2 u ' ' di ' 

H — 2 ~*~ a 2 * Derive this from ( 5 ). 

r a 9 

5. (Radial solution) Show that the only solution of 
V 2 « = 0 depending only on r = V ? + 7 is 

it = a In r + b with constant a and b. 

6. TEAM PROJECT. Series for Dirichlet and 
Neumann Problems 

(a) Show that u n = r n cos nft u n = r n sin «ft n = 0, 
1. • • • , are solutions of Laplace’s equation V 2 // = 0 
with V 2 w given by (5). (What would u u be in Cartesian 
coordinates? Experiment with small /?.) 

(b) Dirichlet problem (See Sec. 12.5) Assuming that 
termwise differentiation is permissible, show that a 
solution of the Laplace equation in the disk r < R 
satisfying the boundary condition //(/?, 6) = f(9) 
(/ given) is 


( 20 ) 


«(/*. 0) — a 0 + 


+ b 


n 



cos n 6 


with arbitrary A 0 and 

1 r 

A n = vnR n-l J C0S n0 d8 ’ 

Bn = j_m sin node. 


(e) Compatibility condition Show that (9), Sec. 10.4, 
imposes on f(0) in (d) the “compatibility condition" 

J m de = o. 


(f) Neumann problem Solve V 2 // = 0 in the annulus 
1 < /* < 3 if « r (L 0) = sin ft u r ( 3, 0) = 0. 


7-12 


ELECTROSTATIC POTENTIAL. 
STEADY-STATE HEAT PROBLEMS 


The electrostatic potential satisfies Laplace’s equation 
V 2 // = 0 in any region free of charges. Also the heat 
equation u t = c 2 V 2 w (Sec. 12.5) reduces to Laplace’s equation 
if the temperature u is time-independent (“steady-state 
case”). Using (20), find the potential (equivalently: the 
steady-state temperature) in the disk r < 1 if the boundary 
values are (sketch them, to see what is going on). 

7. i/(l, 6) = 40 cos 3 6 

8. i/(l, 9) = 800 sin 3 9 

9. u( 1, 9) = 110 if -\tt < 9 < |7rand 0 otherwise 

10. i/(l, 9) = 9 if -\tt < 9 < §7r and 0 otherwise 

11. //(I, 9) = |0| if —7 T < 9 < 7T 

12. 11(1, 9) = 0 2 if -77 < 9 < 7T 


where a n , b n are the Fourier coefficients of / (see 
Sec. 11.1). 

(c) Dirichlet problem Solve the Dirichlet problem 
using (20) if R = 1 and the boundary values are 
u(9) = —100 volts if — 7T < 0 < 0, //(0) = 100 volts 
if 0 < 0 < 7r. (Sketch this disk, indicate the boundary 
values.) 

(d) Neumann problem Show that the solution of the 
Neumann problem V 2 w = 0 if r < R, u N (R , 0) = /(0) 
(where u N = du/dN is the directional derivative in the 
direction of the outer normal) is 

sc 

u(r , 0) = A„ + 2 r n (A n cos nO + B n sin n0) 

n=*l 


13. CAS EXPERIMENT. Equipotential Lines. Guess 
what the equipotential lines //(/*, 0) = const in Probs. 
9 and 1 1 may look like. Then graph some of them, 
using partial sums of the series. 

14. (Semidisk) Find the electrostatic potential in the 
semidisk r < 1 , 0 < 0 < ir which equals 1 1O0(7T — 0) 
on the semicircle r = 1 and 0 on the segment 
- 1 < .v < 1 . 

15. (Semidisk) Find the steady-state temperature in a 
semicircular thin plate r < a, 0 < 9 < ir with the 
semicircle r = a kept at constant temperature u 0 and 
the segment — a < x < a at 0. 

16. (Invariance) Show that V 2 // is invariant under 
translations a * = .v + n, y* = y + b and under rotations 
x* = x cos a — y sin a, y* = a sin a + y cos a. 
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17. (Frequency) What happens to the frequency of an 
eigenfunction of a drum if you double the tension? 

18. (Size of a drum) A small drum should have a higher 
fundamental frequency than a large one, tension and 
density being the same. How does this follow from our 
formulas? 

19. (Tension) Find a formula for the tension required to 
produce a desired fundamental frequency f x of a 
drum. 


(23) G + A 2 G = 0, where A = ck , 

(24) F rr + y F r + -pr F 00 + k 2 F = 0. 

Show that the PDE can now be separated by 
substituting F = W(r)Q(0), giving 

(25) Q" + n 2 Q = 0, 

(26) r 2 W" 4- rW' + (it 2 /- 2 - n 2 )W = 0. 


20. CAS PROJECT. Normal Modes, (a) Graph the 
normal modes w 4 . it 5 , i/ 6 as in Fig. 306. 

(b) Write a program for calculating the A m 9 s in 
Example 1 and extend the table to m = 15. Verify 
numerically that a m (/?? — ^)tt and compute the 
error for m = 1, • • • , 10. 

(c) Graph the initial deflection f(r) in Example 1 as 
well as the first three partial sums of the series. 
Comment on accuracy. 

(d) Compute the radii of the nodal lines of w 3 > w 4 
when R = 1. How do these values compare to those of 
the nodes of the vibrating string of length 1 ? Can you 
establish any empirical laws by experimentation with 
further w m ? 

21. (Nodal lines) Is it possible that for fixed c and R two 
or more u m [see (16)] with different nodal lines 
correspond to the same eigenvalue? (Give a reason.) 

22. Why is Ay + A 2 + * * * = 1 in Example 1? Compute 
the first few partial sums until you get 3-digit accuracy. 
What does this problem mean in the field of music? 

23. (Nonzero initial velocity) Show that for (17) to satisfy 
(9b) we must have 


( 21 ) 


2 

ca m RJ ! 2 {a m ) 


X 



25. (Periodicity) Show that Q($) must be periodic with 
period 2 tt and, therefore, n = 0, 1, 2, • * • in (25) and 
(26). Show that this yields the solutions Q n — cos nO , 
Q n * = sin nB , W n = / n (£/‘)> n = 0, 1, • * * . 

26. (Boundary condition) Show that the boundary 
condition 


(27) 


«(/?. 0 , /) = 0 


leads to k = k mn = a mn !R, where s = ot mn is the mth 
positive zero of J n (s). 

27. (Solutions depending on both r and 0) Show that 
solutions of (22) satisfying (27) are (see Fig. 307) 


(28) 


u mn = (A mn cos ck mn t H* sin ck mn t') X 
x Jn(Kn n r) cos n$ 

l( inn = cos ck inn t + sin ck mn t) X 
X / n (W) sin nO 





^11 12 23 

Fig. 307. Nodal lines of some of the solutions (28) 


VIBRATIONS OF A CIRCULAR MEMBRANE 
DEPENDING ON BOTH r AND 0 

24. (Separations) Show that substitution of it = F(r, 0)G(t) 
into the wave equation (6), that is, 

(22) u tt = c 2 Ur + “ooj 


28. (Initial condition) Show that u t (r, 0 , 0) = 0 gives 
B mn = 0, B* n = 0 in (28). 

29. Show that u% 0 = 0 and u 7n0 is identical with (16) in 
the current section. 

30. (Semicircular membrane) Show that u n represents 
the fundamental mode of a semicircular membrane and 
find the corresponding frequency when c 2 = 1 and 
R= 1. 


gives an ODE and a PDE 
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12.1 Laplace’s Equation in Cylindrical and 
Spherical Coordinates. Potential 

Laplace’s equation 

(1) V 2 « = Uxx + Uyy + U ZZ = 0 


is one of the most important PDEs in physics and its engineering applications. Here, 
jc, y, z are Cartesian coordinates in space (Fig. 165 in Sec. 9.1), u ^ = d 2 u/dx 2 , etc. The 
expression V 2 « is called the Laplacian of u. The theory of the solutions of (1) is called 
potential theory. Solutions of (1) that have continuous second partial derivatives are 
known as harmonic functions. 

Laplace’s equation occurs mainly in gravitation, electrostatics (see Theorem 3, 
Sec. 9.7). steady-state heat flow (Sec. 12.5), and fluid flow (to be discussed in 
Chap. 18.4). 

Recall from Sec. 9.7 that the gravitational potential u(x , y, z ) at a point (a*, y, z) resulting 
from a single mass located at a point (X, Y, Z) is 


( 2 ) 


u( x, y, z) 


c 


r 


c 

Vc X - Xf + (y - Yf + (Z - Zf 


(r > 0) 


and u satisfies (1). Similarly, if mass is distributed in a region T in space with density 
p(X, Y, Z), its potential at a point (,v, y, z) not occupied by mass is 


(3) u(x, y, z) = k fff — - — dX dY dZ. 

T 1 

It satisfies (1) because V 2 (l/r) = 0 (Sec. 9.7) and p is not a function of A', y, z- 
Practical problems involving Laplace’s equation are boundary value problems in a 
region T in space with boundary surface S. Such a problem is called (see also Sec. 12.5 
for the two-dimensional case): 

(I) First boundary value problem or Dirichlet problem if u is prescribed on 5. 
(II) Second boundary value problem or Neumann problem if the normal 
derivative u n = du/dn is prescribed on S. 

(HI) Third or mixed boundary value problem or Robin problem if u is prescribed 
on a portion of S and u n on the remaining portion of 5. 


Laplacian in Cylindrical Coordinates 

The first step in solving a boundary value problem is generally the introduction of 
coordinates in which the boundary surface 5 has a simple representation. Cylindrical 
symmetry (a cylinder as a region 7) calls for cylindrical coordinates r, 6, z related to x, 
y,z by 


(4) 


x = r cos 0, 


y = r sin 0, 


z = z 


(Fig. 308, p. 588). 
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Fig. 308. Cylindrical coordinates Fig. 309. Spherical coordinates 

For these we get V 2 w immediately by adding u ^ to (5) in Sec. 12.9; thus, 

( 5 ) 


- d 2 U 1 du 1 d 2 U d 2 ll 

V 2 Lt = — « + + -y + 7? • 

dr 2 r dr r 2 d6 2 c)z 2 


Laplacian in Spherical Coordinates 

Spherical symmetry (a ball as region T bounded by a sphere S) requires spherical 
coordinates r, 6 , <t> related to x, y, z by 

(6) x = r cos 6 sin <f>, y = r sin 6 sin </>, z = r cos <f> (Fig. 309). 
Using the chain rule (as in Sec. 12.9), we obtain V 2 u in spherical coordinates 

1 


(7) V 2 « = 


d 2 u 


2 du ( 1 d 2 u t COt <f> du 

+ 7* + ~ + 


d 2 u 


dr* r dr r * d<j>* r * d^> r 2 sin 2 <f> d0 2 

We leave the details as an exercise. It is sometimes practical to write (7) in the form 
1 f d ( 9 du\ 1 d I 1 d 2 u ~| 

ai v " - l r vj + a h) + itn. ■ 

Remark on Notation. Equation (6) is used in calculus and extends the familiar notation 
for polar coordinates. Unfortunately, some books use 6 and <f> interchanged, an extension 
of the notation x = r cos <j>, y = r sin <£ for polar coordinates (used in some European 
countries). 

Boundary Value Problem in Spherical Coordinates 

We shall solve the following Dirichlet problem in spherical coordinates: 


( 8 ) 

( 9 ) 

( 10 ) 


_o 1 f C> / 2 du\ 1 d ( . du \1 n 
V 2 h = -» — r 2 — + — — — sin <£ — = 0. 

r 2 L \ sm <f> d<f> \ 


«(*, <f>) = m 

lim w(r, <£) = 0. 

r— * oo 
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The PDE (8) follows from (7) by assuming that the solution w will not depend on 0 because 
the Dirichlet condition (9) is independent of 0. This may be an electrostatic potential (or 
a temperature) /(</>) at which the sphere S: r = R is kept. Condition (10) means that the 
potential at infinity will be zero. 

Separating Variables by substituting u(i\ <f>) = G(r)H(<f>) into (8). Multiplying (8) by 
r 2 , making the substitution and then dividing by GH , we obtain 

l) 

G dr \ dr j H sin <f> deb \ v d<f> ) 

By the usual argument both sides must be equal to a constant k. Thus we get the two 
ODEs 



The solutions of (11) will take a simple form if we set A: = n(n + 1). Then, writing 
G' = dG/dr, etc., we obtain 

(13) r 2 G" + 2 rG' ~ n(n + 1 )G = 0. 

This is an Euler-Cauchy equation. From Sec. 2.5 we know that it has solutions G — r a . 
Substituting this and dropping the common factor r a gives 

a(a — l) + 2 a — n(n + 1) = 0. The roots are a = n and —n — 1. 

Hence solutions are 

(14) G n (r) = r" and G*(r) = . 

We now solve (12). Setting cos <f> = w, we have sin 2 <f> = 1 — w 2 and 

d _ d dw _ . d 

~d$ ~ ~dw ~d^> ~ _Sm * ~dw ' 

Consequently, (12) with k = n(n + 1) takes the form 


( 15 ) 


d ,, 2 , dH 

— (1 - w 2 ) — + n{n + l)H = 0. 

dw L dw J 
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This is Legendre’s equation (see Sec. 5.3), written out 

, , d z H clH 

(15') (1 - iv 2 ) —=■ - 2 w — + n(n + 1 )H = 0. 

dw * dw 

For integer n = 0, 1, • • • the Legendre polynomials 

H = P n (w) = F n (cos 0) n = 0, 1, • • • , 

are solutions of Legendre’s equation (15). We thus obtain the following two sequences 
of solution u = GH of Laplace’s equation (8), with constant A n and B n , where 
n = 0, 1, • ■ * , 

(16) (a) u n (t\ <j>) = A n r n P n ( cos <j&), (b) u*(r, (/>) = P n ( cos </>) 

r 


Use of Fourier-Legendre Series 

Interior Problem: Potential Within the Sphere S. We consider a series of terms from 
(16a), 

OC 

( 17 ) u(r, <t>) = '2 A n r n P n (cos <f>) (;• S R). 

0 


Since S is given by r = /?, for (17) to satisfy the Dirichlet condition (9) on the sphere S, 
we must have 


(18) u(R, 0) = 2 A n R n P n (cos <f>) = /(</»); 

n—0 

that is, (18) must be the Fourier-Legendre series of /(<£). From (7) in Sec. 5.8 we get 
the coefficients 


2n + 1 r 1 ^ 

(19*) A n R n = — — I f(w)P n (w) dw 

i J -i 

where f(w) denotes /(<£) as a function of w = cos <f > . Since dw = — sin cf> d<f > , and the 
limits of integration — 1 and 1 correspond to <f) = tt and <f> = 0 , respectively, we also 
obtain 


2 n 

( 19 ) A n = — f(<f>)P n (cos 4>) sin <M<k n= 0, 1, • • • . 

If f{4>) and are piecewise continuous on the interval 0 =§ </> tt, then the series (17) 
with coefficients (19) solves our problem for points inside the sphere because it can be 
shown that under these continuity assumptions the series (17) with coefficients (19) gives 
the derivatives occulting in (8) by termwise differentiation, thus justifying our derivation. 
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EXAMPLE 1 


Exterior Problem: Potential Outside the Sphere S . Outside the sphere we cannot use 
the functions u n in (16a) because they do not satisfy (10). But we can use the m* m (16b), 
which do satisfy (10) (but could not be used inside S ; why?). Proceeding as before leads 
to the solution of the exterior problem 

(20) u(r, = S -§r P n(cos 4) (r § R) 

n = 0 r 

satisfying (8), (9), (10), with coefficients 

2 fi “ 1 “ 1 r 7 * 

(21) B n = — — R n+1 I mp n ( cos 4) sin 4 d<f>. 

* J o 


The next example illustrates all this for a sphere of radius 1 consisting of two hemispheres 
that are separated by a small strip of insulating material along the equator, so that these 
hemispheres can be kept at different potentials (1 10 V and 0 V). 


Spherical Capacitor 

Find the potential inside and outside a spherical capacitor consisting of two metallic hemispheres of radius 1 ft 
separated by a small slit for reasons of insulation, if the upper hemisphere is kept at 1 10 V and the lower is 
grounded (Fig. 310). 

Solution . The given boundary condition is (recall Fig. 309) 


M) = 


fno 
1 0 


if 0 ^ <f) < 7t/ 2 
if 7 t/2 < (}) ~ 7 r. 


Since R = 1, we thus obtain from (19) 

In + 1 f 

A n = — - — • 1 10 I ^(cos <j)) sin (J> d<l> 
2 


,.77/2 


2 n + 


- • 1 10 f P n I 


(w) dw 


where w — cos 0. Hence P n ( cos 4>) sin <j> d<j> = —P n (w) dw, we integrate from I to 0, and we finally get rid 
of the minus by integrating from 0 to 1. You can evaluate this integral by your CAS or continue by using (II) 
in Sec. 5.3, obtaining 


M 

A n = 55 (2n + 1) X (-1)"* 


m=0 


(2n - 

2 n m!(n - m)!(n - 2m)! 


f w n ~ 2m dw 
J o 


where M = n!2 for even n and M =■ (n - l)/2 for odd n. The integral equals 1/(w — 2m -F I). Thus 


2 



Fig. 310. Spherical capacitor in Example 1 
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EXAMPLE 2 


( 22 ) 


55(2m + 1) “ 

*» = In 2 (-1)” 

^ m*=0 


(2/? - 2/m)! 


m!(iz — m)l(n - 2m +1)! 
Taking n = 0, we get A 0 = 55 (since 0! = 1). For n = 1, 2, 3,* ♦ • we get 


A, = 


165 

2! 

165 

2 

0!1!2! 

2 ’ 

275 | 

f 4! 

2! 

4 ' 

[ 01213! 

Hill! 

385 i 

f 6! . 

4! 

8 ' 

^ 0!3!4! 

1 !2!2! 


etc. 


Hence the potential (17) inside the sphere is (since P 0 = 1) 
(23) 


165 385 o 

w(r, <f>) = 55 + —t~ r P x ( cos <f>) — - rP 3 (cos <f>) + 

2 o 


(Fig. 311) 


with Piy P 3 , ’ - • given by (l 1 / ), Sec. 5.3, Since R = 1, we see from (19) and (21) in this section that 
B n = A U y and (20) thus gives the potential outside the sphere 


(24) 


55 165 385 

//(/•, <f>) = — + —2 Pit cos <f>) - — P 3 (cos <£) + **•. 
/* 2r 8r 


Partial sums of these series can now be used for computing approximate values of the inner and outer potential. 
Also, it is interesting to see that far away from the sphere the potential is approximately that of a point charge, 
namely, 55/r. (Compare with Theorem 3 in Sec. 9.7.) U 



Fig. 311. Partial sums of the first 4, 6, and 11 
nonzero terms of (23) for r = R = 1 


Simpler Cases. Help with Problems 

The technicalities occurring in cases like that of Example 1 can often be avoided. For instance, find the potential 
inside the sphere S: r = R = 1 when S is kept at the potential f(<f>) = cos 2 <f>. (Can you see the potential on S? 
What is it at the North Pole? The equator? The South Pole?) 

Solution . w = cos <t?y cos 2 <f> = 2 cos 2 <f> — 1 = 2 w 2 — 1 = §P 2 (w) — 3 = 3 ( 2 ^ ~ 2 ) ” 3 - Hence the 
potential in the interior of the sphere is 


« = \r 2 P 2 (w) - § = §r 2 P 2 (cos <£) - | = §r 2 (3 cos 2 <t> - 1) - $. 
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iH^aHEEBESEBEEHEEQE 


1. Derive (7) from V 2 u in Cartesian coordinates. (Show 
the details.) 

2. Find the surfaces on which the functions u lt u 2 , w 3 are 
zero. 

3. Sketch the functions P„(cos <j>) for n = 0, 1 , 2 (see 
(H f ) in Sec. 5.3). 

4. Sketch the functions P 3 (cos $) and P 4 ( cos </>), 

5. Verify that u n and u n * In (16) are solutions of (8). 


6-1 1 1 POTENTIALS DEPENDING ONLY ON r 

6. (Dimension 3) Show that the only solution of the 
Laplace equation depending only on 

r = V.v 2 4 y 2 4 z 2 is u = dr 4 k with constant c 
and k. 


7. (Dimension 3) Verity that a = dr, 

r = V.v 2 -f y 2 4 z 2 , satisfies Laplace’s equation in 
spherical coordinates. 


8. (Dirichlet problem). Find the electrostatic potential 
between two concentric spheres of radii rj = 1 0 cm 
and r 2 = 20 cm kept at potentials Ui = 260 V and 
U 2 — MOV, respectively. 


9. (Dimension 2, logarithmic potential) Show that the 
only solution of the two-di mensiona l Laplace equation 
depending only on r = Vy 2 4 y 2 is u = c In r 4 k 
with constant c and k. 


10. (Logarithmic potential) Find the electrostatic potential 
between two coaxial cylinders of radii r x = 10 cm and 
r 2 = 20 cm kept at potentials U x = 260 V and 
U 2 = 1 10 V, respectively. Compare with Prob. 8. 
Comment. 


11. (Heat problem) If the surface of the ball 

r 2 = .v 2 4 v 2 4- z 2 = R 2 is kept at temperature 
zero and the initial temperature in the ball is /(/*), 
show that the temperature u(/\ /) in the ball is a solution 
of u t = c 2 (u rr + 2u r /r) satisfying the conditions 
u(R, t) = 0, //(/% 0) = /(r). Show that setting v = ru 
gives v t = c 2 v rr , v(R, t ) = 0, v(i\ 0) = r/(r). Include 
the condition u(0. 0 = 0 (which holds because u must 
be bounded at r = 0), and solve the resulting problem 
by separating variables. 


12. (Two-dimensional potential problems) Show that the 
functions .v 2 - v 2 , .vy, x/(x 2 4 y 2 ). e x cos y. e x sin y, 
cos .v coshy. In (.v 2 4 y 2 ), and arctan (y/.v) satisfy 
Laplace’s equation u xx 4 u yy = 0. (Two-dimensional 
potential problems are best solved by complex 
analysis , as we shall see in Chap. 18.) 


1 13—17 1 BOUNDARY VALUE PROBLEMS IN 
SPHERICAL COORDINATES r , 0, <!> 

Find the potential in the interior of the sphere S: r = R = 1 

if this interior is free of charges and the potential on S is: 

13. M) = 100 

14. /(</>) = cos (f> 

15. = cos 3<j) 

16. /(<£) = sin 2 <f) 

17. f(4>) = 35 cos4<^> 4 20 cos 2 <f> 4 9 

18. Show that in Prob. 13 the potential exterior to the 
sphere is the same as that of a point charge at the origin. 
Is this physically plausible? 

19. Sketch the intersection of the equipotential surfaces in 
Prob. 14 with the vz-plane. 

20. Find the potential exterior to the sphere in Example 2 
of the text and in Prob. 15. 

21. What is the temperature in a ball of radius 1 and of 
homogeneous material if its lower boundary 
hemisphere is kept at 0°C and its upper at I00°C? 

22. (Reflection In a sphere) Let r. 6 . be spherical 

coordinates. If //(r, 6 , <f>) satisfies V 2 */ = 0, show that 
u(r , 0, <f>) = m( 1//\ 0, <f>)/r satisfies V 2 u = 0. What 
does this give for (16)? 

23. (Reflection in a circle) Let r, 0 be polar coordinates. 
If w(/\ 0) satisfies V 2 h = 0, show that the function 
i>(/\ 0) = w(l/r, 0) satisfies V 2 y = 0. What are 
u = r cos 0 and v in terms of x and y? Answer the 
same question for u = r 2 cos 0 sin 0 and u. 

24. TEAM PROJECT. Transmission Line and Related 
PDEs. Consider a long cable or telephone wire 
(Fig. 312) that is imperfectly insulated, so that leaks 
occur along the entire length of the cable. The source 
S of the current /(.v, t) in the cable is at x = 0, the 
receiving end T at x = /. The current flows from S to 
7, through the load, and returns to the ground. Let the 
constants R , L, C, and G denote the resistance, 
inductance, capacitance to ground, and conductance to 
ground, respectively, of the cable per unit length. 


Load 

x=Q x=l 

Fig. 312. Transmission line 
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(a) Show that (“first transmission line equation") 


dn_ 

dx 


di 

Ri + L — 

clt 


where u(x, t) is the potential in the cable. Hint: Apply 
Kirchhoff’s voltage law to a small portion of the cable 
between x and jc 4- Ax (difference of die potentials at 
x and x + Ax - resistive drop 4* inductive drop). 

(b) Show that for the cable in (a) (“second 
transmission line equation"). 


di 

dx 


du 

= Gu + C — 
di 


Hint : Use Kirchhoffs current law (difference of the 
currents at x and * 4- Ax = loss due to leakage to 
ground + capacitive loss). 

(c) Second-order PDEs. Show that elimination of 
/ or u from the transmission line equations leads to 


u xx = LCun 4- (RC 4- GL)u t 4- RGu , 
i xx = LCi tt + (/?C + GL)i t 4- RGi. 


(d) Telegraph equations. For a submarine cable, 
G is negligible and the frequencies are low. Show that 
this leads to the so-called submarine cable equations 

or telegraph equations 

u xx = RCu t , i xx = RCi t . 

Find the potential in a submarine cable with ends 
( x = 0, x = /) grounded and initial voltage distribution 
U 0 = const. 

(e) High-frequency line equations. Show that in the 
case of alternating currents of high frequencies the 
equations in (c) can be approximated by the so-called 
high-frequency line equations 

m xx ~ LCun , i X x LCitt. 

Solve the first of them, assuming that the initial 
potential is 

U 0 sin (ttx//), 

and w t (x, 0) = 0 and u = 0 at the ends x = 0 and 
x = / for all t. 


12.11 Solution of PDEs by Laplace Transforms 

Readers familiar with Chap. 6 may wonder whether Laplace transforms can also be used 
for solving partial differential equations. The answer is yes, particularly if one of the 
independent variables ranges over the positive axis. The steps to obtain a solution are 
similar to those in Chap. 6. For a PDE in two variables they are as follows. 

1. Take the Laplace transform with respect to one of the two variables, usually r. This 
gives an ODE for the transform of the unknown function. This is so since the 
derivatives of this function with respect to the other variable slip into the transformed 
equation. The latter also incorporates the given boundary and initial conditions. 

2. Solving that ODE, obtain the transform of the unknown function. 

3. Taking the inverse transform, obtain the solution of the given problem. 

If the coefficients of the given equation do not depend on f, the use of Laplace transforms 
will simplify the problem. 

We explain the method in terms of a typical example. 


EXAMPLE 1 Semi-Infinite String 


Find the displacement vv(x. /) of an elastic string subject to the following conditions. (We write w since we need 
u to denote the unit step function.) 

(i) The string is initially at rest on the x-axis from x = 0 to oo (“semi-infinite string ”). 


(ii) For i > 0 the left end of the string (x = 0) is moved in a given fashion, namely, according to a single 
sine wave 

{ sin / if 0 ^ ^ 2tt 

0 


otherwise 


(Fig. 313). 
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Fig. 313. Motion of the left end of the string in Example 1 as a function of time f 


(iii) Furthermore, lim u(jc, /) = 0 for t ^ 0. 

.r-*oc 

Of course there is no infinite siring, but our model describes a long string or rope (of negligible weight) with 
its right end fixed far out on the .v-axis. 

Solution . We have to solve the wave equation (Sec. 12.2) 

(1) 


a 2 w _ 2 aV 


„a_ I 


(/SO) 


ar 

for positive x and /, subject to the “boundary conditions” 

(2) u'(0, /) = /(/), lim w(x. /) = 0 

X— 

with / as given above, and the initial conditions 

(3) (a) iv(.v, 0) = 0, (b) w t (x y 0) = 0. 

We take the Laplace transform with respect to t. By (2) in Sec. 6.2, 

% = j.- 2 £C{h') - j»v(.y. 0) - iv e (.v, 0) = c 2 X |^ 2 "J • 

The expression —s\\>(x. 0) — vv t (.v, 0) drops out because of (3). On the right we assume that we may interchange 
integration and differentiation. Then 

f d z w ] d 2 w d 2 f 06 r? 2 


Writing W(x, s ) = i£(w(.v, 0), we thus obtain 

d 2 W 


s 2 W = c 2 


Ox 2 


thus 


a** 


— t w = o. 


Since tills equation contains only a derivative with respect to x, it may be regarded as an ordinary differential 
equation for W{ x t s) considered as a function of x. A general solution is 

(4) W(.y, s) = A(s)e sxk + B(s)e~ sx,c . 

From (2) we obtain, writing F(s ) = 5£{/(/)}, 

W(0. s) = <£{iv(0, 0} = 2{/(0) = FIs)- 


Assuming that we can interchange integration and taking the limit, we have 

00 oo 

lim VV(a\ s ) = lim I e -s£ H’(.v, t) dt = I e~ st lim w(.v, t) dt = 0. 

*->00 X—GQ J n X-rOO 


This implies A(s) = 0 in (4) because c > 0, so that for every fixed positive s the function e sxfc increases as x 
increases. Note that we may assume s > 0 since a Laplace transform generally exists for all s greater than some 
fixed k (Sec. 6.2). Hence we have 


W(0, s) = B(s) = F{s) y 
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so that (4) becomes 

W(x. s ) = F(s)e~ sx,c . 


From the second shifting theorem (Sec. 6.3) with a = xlc we obtain the inverse transform 


(5) 




(Fig. 314) 


that is. 


w(x, f) = sin 



if 


-*</< — 4* 27 r or ct > x > (/ — 2tt)c 

c c 


and zero otherwise. This is a single sine wave traveling to the right with speed c. Note that a point x remains 
at rest until t = xfc, the time needed to reach that x if one starts at t — 0 (start of the motion of the left end) 
and travels with speed c. The result agrees with our physical intuition. Since we proceeded formally, we must 
verify that (5) satisfies the given conditions. We leave this to the student. H 


it = 0) I 

x 

it = 2 it) L _ZZ^1 

V7 2jtc * 


0 = 4/r) L 


(t = 6 k) I 

Fig. 314. 


XT" 




X7 

Traveling wave in Example 1 


x 


x 


This is the end of Chap. 12, in which we concentrated on the most important partial 
differential equations (PDEs) in physics and engineering. This is also the end of Part C 
on Fourier analysis and PDEs. 

We have seen that PDEs have various basic engineering applications, which make them 
the subject of many ongoing research projects. 

Numerics for PDEs follows in Secs. 21.4-21.7, which are independent of the other 
sections in Part E on numerics. 

In the next part. Part D on complex analysis, we turn to an area of a different nature 
that is also highly important to the engineer, as our examples and problems will show. 
This will include another approach to the (two-dimensional) Laplace equation and its 
applications in Chap. 18. 




!. Sketch a figure similar to Fig. 314 if c = 1 and f is 
“triangular” as in Example 1, Sec. 12.3. 

2. How does the speed of the wave in Example 1 depend 
on the tension and on the mass of the string? 

3. Verify the solution in Example 1. What traveling wave 
do we obtain in Example 1 in the case of a 


(nonterminating) sinusoidal motion of the left end 
starting at t = 0? 

fT6] SOLVE BY LAPLACE TRANSFORMS 

dw dw 

4. — + X— - X, w(a\ 0) = 1, w(0, t) = I 
ax at 
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5. 




dx 



w(jc, 0) = 0 if A' = 0. 
w( 0, /) = 0 if t ^ 0 


„ d 1 2 3 w n d 2 w ^ dw 

6. — 2 - = 100-tt* + 100— 4- 25w y 
dx 2 dt 2 dt 


w(a*, 0) = 0 if a* = 0, vt> t (A% 0) = 0 if / fe 0, 
vi/(0, t) = sin t if r ^ 0 


7. Solve Prob. 5 by another method. 


8-10 


HEAT PROBLEM 


Find the temperature w( a% t) in a semi-infinite laterally 
insulated bar extending from a- = 0 along the a - axis 
to infinity, assuming that the initial temperature is 0, 
>v(a\ /) — > 0 as a* — » oo for every fixed t ^ 0, and 
vv(0, t) = /(?). Proceed as follows. 


8. Set up the model and show that the Laplace transform 
leads to 


Applying the convolution theorem, show that 
>*•(*, 0 = jj(t - T ) T -^ e -^i^ dr. 


9. Let w(0, t) = f(t) = u(t) (Sec. 6.3). Denote the 
corresponding iv, W, and F by w Qi W 0i and F 0 . Show 
that then in Prob. 8, 

w °^ ;) = ikz J o T- 3/2 e~ Mr> dr 



with the error function erf as defined in Problem 
Set 12.6. 

10. (Duhamel’s formula 4 5 6 7 8 9 ) Show that in Prob. 9, 

W 0 (x, s) = - e-''** 10 11 
s 


„ d z W 

sw = c 2 -^- ov=<e{w}> 

and 

IV = F(s)e~ s/5xlc ( F = 2{f}). 


and the convolution theorem gives Duhamel’s f omnia 

f* dH'o 

>v(,x. t) = f(r- r) — — dr. 

J o 


: a«a : ars a a a :a sr-a s 


IBQ3HESTIONS AND PROBLEMS 


1. Write down the three probably most important PDEs 
from memory and state their main applications. 

2. What is the method of separating variables for PDEs? 
Give an example from memory. 

3. What is the superposition principle? Give a typical 
application. 

4. What role did Fourier series play in this chapter? Fourier 
integrals? 

5. What are the eigenfunctions and their frequencies of the 
vibrating string? Of the heat equation? 

6. What additional conditions did we consider for the wave 
equation? For the heat equation? 

7. Name and explain the three kinds of boundary conditions. 

8. What do you know about types of PDEs? About 
transformation to normal forms? 

9. What is d’Alembert’s method? To what PDE does it 
apply? 

10. When and why did we use polar coordinates? Spherical 
coordinates? 

11. When and why did Legendre’s equation occur in this 
chapter? Bessel’s equation? 


12. What are the eigenfunctions of the circular membrane? 
How do their frequencies differ in principle from those 
of the eigenfunctions of the vibrating string? 

13. Explain mathematically (not physically) why we got 
exponential functions in separating the heat equation, 
but not for the wave equation. 

14. What is the error function? Why did it occur and where? 

15. Explain the idea of using Laplace transform methods 
for PDEs. Give an example from memory. 

16. For what k and m are x 4 4- kx 2 y 2 4- y 4 and 
sin rnx sinh y solutions of Laplace’s equation? 

17. Verify that (. x 2 — y 2 )/( x 2 4- y 2 ) 2 satisfies Laplace’s 
equation. 


18-21 

18. 

"vu 

19. 

U XX 

20. 

Uxy 

21. 

u yy 


Solve for u = u{ x, y): 

4- 16w = 0 

4- u x — 2u — 0 

4- tty 4- a * 4- y + 1 =0 

+ Uy = o, u(x, 0) = /( x), Uy(x, 0) = g(jf) 


22. Find all solution u(x, y) = F{x)G{y) of Laplace’s 
equation in two variables. 


4 JEAN-MARIE CONSTANT DUHAMEL (1797-1872), French mathematician. 
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23-26 Find and sketch or graph (as in Fig. 285 
in Sec. 12 . 3 ) the deflection u(x , t) of a vibrating string of 
length 7T . extending from .v = 0 to x = i r, and 
c 2 = Tip = l , starting with velocity 0 and deflection 

23. fix) = sin a - | sin 2a* 

24. /( x) = \x - %ir\ 

25. fix) = sin 3 .v 

26. f(x) = A*( 7T — A) 

1 27-30 1 Find the temperature distribution in a laterally 
insulated thin copper bar ( c 2 = Klpa = 1.158 cm 2 /sec), 
50 cm long and of constant cross section with endpoints at 
x = 0 and 50 kept at 0°C and initial temperature 

27 . /(.v) = sin (7 ta/50) 

28 . /(a) = a*(50 - x) 

29. /(a) = 25 - |25 - a* | 

30. fix) = A sin 3 (ttx/ 10) 


Find the temperature w(a*, /) in a laterally 
insulated bar of length 7r, extending from a* = 0 to x = tt , 
with c 2 = 1 for adiabatic boundary condition (see Problem 
Set 12.5) and initial temperature 

31. 100 cos 4a 

32. 3.v 2 

33. tt — 2\x - \tt\ 


where 

Bmn = ~2 J J /(** >’) sin WY sin n y dx dy- 

36. Find the temperature in Prob. 35 if 
fix , y) = a*( 7T - A-)y(7T - y). 


37—42 Transform to normal form and solve (showing 
the details!) 

37 . U x y H XX 

38 . U XX + Alt X y 4- Allyy = 0 

lt xx ^ lt uv ~~ ® 

40 . 2U XX + 5 li X y + 2 Uyy = 0 

41 . li xx + 2 u xy + lt yy ~ ® 

42 . Uyy 4 U X y — 2 u xx — 0 

Show that the following membranes of area 1 
with c 2 = 1 have the frequencies of the fundamental mode 
as given (4-decimal values). Compare. 

43 . Circle: a^lVir) = 0.6784 

44. Square: lV2 = 0.7071 

45. Rectangle (sides 1 :2): VH/8 = 0.7906 

46 . Semicircle: 3.832 iV&r = 0.7644 

47 . Quadrant of circle: q 12 /(4 Vtt) = 0.7244 
(a l2 = 5.13562 = first positive zero of J 2 ) 


34. Using partial sums, graph m(a\ /) in Prob. 33 for several 
constant t on common axes. Do these graphs agree with 
your physical intuition? 

35. Let fix , y ) = z/(a\ y> 0) be the initial temperature in a 
thin square plate of side 7 r with edges kept at 0°C and 
faces perfectly insulated. Separating variables, obtain 
from u t = c 2 V 2 u the solution 

«(a\ y\ t) = X 2 sin mx sin n y e" c2(m2+,,2)t 

m=l ti = 1 


48-50 Find the electrostatic potential in the following 

(charge-free) regions: 

48. Between two concentric spheres of radii r 0 and r x kept 
at the potentials u 0 and u lt respectively. 

49. Between two coaxial circular cylinders of radii r 0 and 
/•j kept at the potential u 0 and u v respectively. 
(Compare with Prob. 48.) 

50. In the interior of a sphere of radius 1 kept at the 
potential /(<£) = cos 3 <j> + 3 cos <f> (referred to our 
usual spherical coordinates). 


SUMMARY OF .C H APTJEtt 'tjCT : - 

Partial Differential Equations (PDEs) 


Whereas ODEs (Chaps. 1-6) serve as models of problems involving only one 
independent variable, problems involving two or more independent variables (space 
variables or time t and one or several space variables) lead to PDEs. This accounts for 
the enormous importance of PDEs to the engineer and physicist. Most important are: 

(1) u tt = One-dimensional wave equation (Secs. 12.2-12.4) 

(2) u tt = + u yiJ ) Two-dimensional wave equation (Secs. 12.7-12.9) 
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(3) u t = c 2 r/rx One-dimensional heat equation (Secs. 12.5, 12.6) 

(4) V 2 « = Uxx + u yy = 0 Two-dimensional Laplace equation (Secs. 12.5, 12.9) 

(5) V 2 « = u xx 4- u yy + u zz = 0 Three-dimensional Laplace equation 

(Sec. 12.10). 

Equations (1) and (2) are hyperbolic, (3) is parabolic, (4) and (5) are elliptic. 

In practice, one is interested in obtaining the solution of such an equation in a 
given region satisfying given additional conditions, such as initial conditions 
(conditions at time t = 0 ) or boundary conditions (prescribed values of the solution 
u or some of its derivatives on the boundary surface S, or boundary curve C, of the 
region) or both. For ( 1 ) and (2) one prescribes two initial conditions (initial 
displacement and initial velocity). For (3) one prescribes the initial temperature 
distribution. For (4) and (5) one prescribes a boundary condition and calls the 
resulting problem a (see Sec. 1 2.5) 

Dirichlet problem if u is prescribed on S, 

Neumann problem if u n = du/dn is prescribed on S, 

Mixed problem if u is prescribed on one part of S and u n on the other. 

A general method for solving such problems is the method of separating 
variables or product method, in which one assumes solutions in the form of 
products of functions each depending on one variable only. Thus equation (1) is 
solved by setting m(jt, r) = F(x)G(t); see Sec. 12.3; similarly for (3) (see Sec. 12.5). 
Substitution into the given equation yields ordinary differential equations for F and 
G, and from these one gets infinitely many solutions F = F n and G = G n such that 
the corresponding functions 

« n ( A\ t) = F n (x)G n (r) 

are solutions of the PDE satisfying the given boundary conditions. These are the 
eigenfunctions of the problem, and the corresponding eigenvalues determine the 
frequency of the vibration (or the rapidity of the decrease of temperature in the case 
of the heat equation, etc.). To satisfy also the initial condition (or conditions), one 
must consider infinite series of the u n , whose coefficients turn out to be the Fourier 
coefficients of the functions / and g representing the given initial conditions 
(Secs. 12.3, 12.5). Hence Fourier series (and Fourier integrals) are of basic 
importance here (Secs. 12.3, 12.5, 12.6, 12.8). 

Steady-state problems are problems in which the solution does not depend on 
time t . For these, the heat equation u t = c 2 V 2 w becomes the Laplace equation. 

Before solving an initial or boundary value problem, one often transforms the 
PDE into coordinates in which the boundary of the region considered is given by 
simple formulas. Thus in polar coordinates given by x = r cos 0 , y = r sin 0 , the 
Laplacian becomes (Sec. 12.9) 

( 6 ) V 2 « = ~ u r + - 5 - u 8 a ; 

r r * 

for spherical coordinates see Sec. 12. 10. If one now separates the variables, one gets 
Bessel’s equation from (2) and ( 6 ) (vibrating circular membrane, Sec. 12.9) and 
Legendre’s equation from (5) transformed into spherical coordinates (Sec. 12.10). 
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Complex Numbers and Functions 

Complex Integration 

Power Series, Taylor Series 

Laurent Series. Residue Integration 

Conformal Mapping 

Complex Analysis and Potential Theory 


Many engineering problems can be modeled, investigated, and solved by functions of a 
complex variable. For simpler problems, some acquaintance with complex numbers will 
suffice. This is true for simpler electric circuits and mechanical vibrating systems. For 
more complicated problems in heat conduction, fluid flow, electrostatics, etc., one needs 
the theory of complex analytic functions, briefly called complex analysis. The importance 
of the latter in applied mathematics has three main reasons: 


1. Most importantly, the real and imaginary parts of an analytic function satisfy 
Laplace’s equation in two real variables. Hence two-dimensional potential problems can 
be solved by methods for analytic functions, and this is often simpler than working in 
real. 

2. Many complicated real and complex integrals in applications can be evaluated by 
the elegant methods of complex integration. 

3. Most functions in engineering mathematics are analytic functions, and their study 
as functions of a complex variable leads to a deeper understanding of their properties and 
to interrelations in complex that have no analog in real calculus. 
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chapter! 3 

Complex Numbers 
and Functions 


Complex numbers and their geometric representation in the complex plane are discussed 
in Secs. 13.1 and 13.2. Complex analysis is concerned with complex analytic functions 
as defined in Sec. 13.3. Checking for analyticity is done by the Cauchy-Riemann 
equations (Sec. 13.4). These equations are of basic importance, also because of their 
relation to Laplace’s equation. 

The remaining sections of the chapter are devoted to elementary complex functions 
(exponential, trigonometric, hyperbolic, and logarithmic functions). These generalize the 
familiar real functions of calculus. Their detailed knowledge is an absolute necessity in 
practical work, just as that of their real counterparts is in calculus. 

Prerequisite : Elementary calculus. 

References and Answers to Problems: App. 1 Part D, App. 2. 


13.1 Complex Numbers. Complex Plane 

Equations without real solutions, such as x 2 = — 1 or x 2 — l(k + 40 = 0, were observed 
early in history and led to the introduction of complex numbers. 1 By definition, a complex 
number z is an ordered pair (,r, y) of real numbers x and y, written 


z = (x, y ). 


x is called the real part and y the imaginary part of z, written 

x = Re z> y — Im z. 

By definition, two complex numbers are equal if and only if their real parts are equal 
and their imaginary parts are equal. 

(0, 1) is called the imaginary unit and is denoted by z, 

( 1 ) / = ( 0 , 1 ). 


1 First to use complex numbers for this purpose was the Italian mathematician GIROLAMO CARDANO 
(1501-1576), who found the formula for solving cubic equations. The term “complex number” was introduced 
by CARL FRIEDRICH GAUSS (see the footnote in Sec. 5.4). who also paved the way for a general use of 
complex numbers. 
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Addition, Multiplication. Notation z = x + iy 

Addition of two complex numbers zi = (a 1s ft) and z 2 = (a 2 , ft) is defined by 

(2) Zi + z 2 = (ft, ft) + (ft, ft) = (ft + ft, ft + ft). 

Multiplication is defined by 

(3) ftft = (ft, ft)(ft, 3 ; 2> = (ftft “ ftft, ftft + ftft)- 

In particular, these two definitions imply that 


(ft, 0) + (ft, 0) = (ft 4- ft, 0) 
and 

(a* 1s 0)(a* 2 , 0) = (Aift, 0) 


as for real numbers ft, ft. Hence the complex numbers “extend” the real numbers. We 
can thus write 


(a, 0) = a. Similarly, (0, y) = iy 

because by (1) and the definition of multiplication we have 

iy = (0, 1 )y = (0, l)(y, 0) = (0 • y - 1 • 0, 0-0+1 -y) = (0, y). 

Together we have by addition (a, y) = (a, 0) + (0, y) = x + iy: 

7/i practice, complex numbers z = ( x , y) are written 

(4) z = a + iy 

or z = a + yi, e.g., 17 + 4/ (instead of z4). 

Electrical engineers often write j instead if i because they need i for the current. 

If a = 0, then z = iy and is called pure imaginary. Also, (1) and (3) give 

(5) i 2 =- 1 

because by the definition of multiplication, i 2 = ii = (0, 1)(0, 1) = (—1, 0) = -1. 

For addition the standard notation (4) gives [see (2)] 

(ft + *ft) + (ft + %) = (ai + a 2 ) + i (ft + ft). 

For multiplication the standard notation gives the following very simple recipe. Multiply 
each term by each other term and use i z = —1 when it occurs [see (3)]: 

(*1 + '>l)(*2 + iyz) = *1*2 + '*1>’2 + 0*1*2 + ‘ 2 >\y2 
= (*1*2 - yiyz) + «‘(*i 3*2 + *2^i)- 


This agrees with (3). And it shows that a + iy is a more practical notation for complex 
numbers than (a, y). 
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CHAP. 13 Complex Numbers and Functions 


EXAMPLE 1 


EXAMPLE 2 


If you know vectors, you see that (2) is vector addition, whereas the multiplication (3) 
has no counterpart in the usual vector algebra. 

Real Part, Imaginary Part, Sum and Product of Complex Numbers 

Let zj = 8 + 3/ and z 2 = 9 - 2 /, Then Re z\ = 8, Ira zj = 3, Re z 2 = 9. Ira z 2 = -2 and 

Zi + z 2 = (8 + 3/) + (9 - 2 J) = 17 + /. 

-^2 = (8 + 3i)(9 - 20 = 72 + 6 + /(- 16 + 27) = 78 +1 It. ■ 


Subtraction, Division 

Subtraction and division are defined as the inverse operations of addition and 
multiplication, respectively. Thus the difference z = Z\ — Z 2 is the complex number z for 
which Zi = z + 42 * Hence by (2), 

(6) zi - z 2 = (*i - * 2 ) + Kyi ~ 3' 2 )- 


The quotient z = Zi/z 2 (^2 ^ 0) is the complex number z for which z 1 = zz 2 . If we equate 
the real and the imaginary parts on both sides of this equation, setting z = x : + iy, we 
obtain x t = x 2 x — y 2 y, y 1 = y 2 x + x 2 y. The solution is 


(7*) 


— = x + iy, 
~2 


*1*2 + M2 
-V 2 2 + )'2 2 


-y 2 .vi - xi yz 

-V 2 2 + .V2 2 


The practical rule used to get this is by multiplying numerator and denominator of t\lz 2 
by .v 2 - iy 2 and simplifiying: 


(7) 


*1 + O’l = (fl + O’l) U'2 - K’2) _ *1*2 + 3’l>2 . * 23 ’! - * 1 .V 2 

x 2 + iy 2 {x 2 + iy 2 )(x 2 - iy 2 ) x 2 + y 2 ,v 2 2 + y 2 


Difference and Quotient of Complex Numbers 

For ix = 8 + 3 i and z 2 = 9 — 2/ we get = (8 + 30 — (9 - 20 = — I + Si and 

Zi_ _ 8 + 3/ _ (8 + 3Q (9 + 2Q _ 66 + 43/ _ 66 43 . 

z 2 ~ 9 - 2» " (9 - 20(9 + 2») ~ 81+ 4 85 + 85 '' 

Check the division by multiplication to get 8 + 3/. ■ 

Complex numbers satisfy the same commutative, associative, and distributive laws as real 
numbers (see the problem set). 


Complex Plane 

This was algebra. Now comes geometry: the geometrical representation of complex 
numbers as points in the plane. This is of great practical importance. The idea is quite 
simple and natural We choose two perpendicular coordinate axes, the horizontal .r-axis, 
called the real axis, and the vertical y-axis, called the imaginary axis. On both axes we 
choose the same unit of length (Fig. 315). This is called a Cartesian coordinate system. 
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Fig. 315. The complex plane Fig. 316. The number 4 — 3/ in 

the complex plane 


We now plot a given complex number z = (jc, y) = x + iy as the point P with coordinates 
x ; v. The xy-plane in which the complex numbers are represented in this way is called the 
complex plane . 2 Figure 3 1 6 shows an example. 

Instead of saying “the point represented by z in the complex plane” we say briefly and 
simply “the point z in the complex plane." This will cause no misunderstandings. 

Addition and subtraction can now be visualized as illustrated in Figs. 317 and 318. 




Fig. 317. Addition of complex numbers Fig. 318. Subtraction of complex numbers 

Complex Conjugate Numbers 

The complex conjugate z of a complex number z = a* + iy is defined by 

z = x - iy. 

It is obtained geometrically by reflecting the point z in the real axis. Figure 319 shows 
this for z — 5 + 2/ and its conjugate z = 5 — 2/. 



Fig. 319. Complex conjugate numbers 

2 Sometimes called the Argand diagram, after the French mathematician JEAN ROBERT ARGAND 
(1768-1822), bom in Geneva and later librarian in Paris. His paper on the complex plane appeared in 1806, 
nine years after a similar memoir by the Norwegian mathematician CASPAR WESSEL ( 1 745-1818), a surveyor 
of the Danish Academy of Science. 
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CHAP. 13 Complex Numbers and Functions 


The complex conjugate is important because it permits us to switch from complex 
to real. Indeed, by multiplication, zz = a 2 + y 2 (verify!). By addition and subtraction, 
z + z = 2v, z — z = 2 iy. We thus obtain for the real part a* and the imaginary part y 
(not iy\) of z = x + iy the important formulas 

(8) Re z = x = “ (z + z), Im z = y = ■ (z - z). 

2 2 / 

If z is real, z = a, then z = z by the definition of z, and conversely. 

Working with conjugates is easy, since we have 


(9) 


fa + z 2 ) = z x + z 2 . 


teiz 2 ) — ziz 2 > 


tei “ ^ 2 ) = zi - z 2 , 



|i 

z 2 * 


EXAMPLE 3 Illustration of (8) and (9) 

Lei zi = 4 4- 3/ and c 2 = 2 + 5 1 . Then by (8), 

1 3/ + 3/ 

Im zi = — [(4 -i* 3/) - (4 - 3/)] = — — = 3. 

Also, the multiplication formula in (9) is verified by 


(^ 2 ) = (4 + 3/)(2 + 5 0 = (-7 4- 260 = -7 - 26/, 

Z,Z2 = (4 - 3/)(2 - 5/) = -7 - 26/. ■ 




1. (Powers of i) Show that i 2 = -1, / 3 = / 4 = 1, 

/ 5 = /,••• and l/i = l// 2 = -1, l// 3 = /,*•*. 

2. (Rotation) Multiplication by / is geometrically a 
counterclockwise rotation through n/2 (90°). Verify 
this by graphing z and iz and the angle of rotation for 
z = 2 4- 2/, z = — 1 — 5/, z = 4 — 3/. 

3. (Division) Verify the calculation in (7). 

4. (Multiplication) If die product of two complex numbers 
is zero, show that at least one factor must be zero. 

5. Show that z = x 4- iy is pure imaginary if and only 
ifz = -z. 

6. (Laws for conjugates) Verify (9) for z x = 24 4- 10/, 
z 2 = 4 + 6/. 


7-15 1 COMPLEX ARITHMETIC 

Let zi = 2 4- 3/ and z 2 = 4 - 5/. Showing the details 
of your work, find (in the form a* 4- iy): 

7. (5zi 4- 3 z 2 ) 2 8. ZiZ 2 


9. Re ( 1/z, 2 ) 10. Re (z 2 2 ), (Re z 2 ) 2 

11. z 2 /zj 12. Zj /z 2 , (zj/z 2 ) 


13. (4z! — z 2 ) 2 14. Zx/zj, zj/z, 

15. (zj + z 2 )/(zi - z 2 ) 


16-19 Let z = x + /v. Find: 

16. Im z 2 , (Im z) 3 

17. Re (1/z) 


18. Im [(1 4- /) 8 z 2 ] 

19. Re (1/z 2 ) 


20. (Laws of addition and multiplication) Derive the 
following laws for complex numbers from the 
corresponding laws for real numbers. 

Z\ + z 2 = z 2 + Zi, ZiZ 2 = z 2 Zi ( Commutative laws) 

tei + z 2 ) 4 z 3 = Zi + (z 2 + z 3 ), 

(Associative laws ) 
teiZ 2 )z 3 = zi(z 2 z 3 ) 

Zi(z 2 4 zz) — ZiZ 2 4 ZiZ 3 (Distributive law) 
0 + z = z + 0 = z, 
z 4 (-z) = (-z) -4 z = 0, 


z • 1 = z. 
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Polar Form of Complex Numbers. 

Powers and Roots 

The complex plane becomes even more useful and gives further insight into the arithmetic 
operations for complex numbers if besides the Ay-coordinates we also employ the usual 
polar coordinates r, 0 defined by 

(1) x = r cos 0, y= r sin 6, 

We see that then z = a* + iy takes the so-called polar form 

(2) z = r(cos 0 + i sin 0). 

r is called the absolute value or modulus of z and is denoted by \z\. Hence 

(3) \z\ = /• = Va 2 + y 2 = Vzl . 

Geometrically, \z\ is the distance of the point z from the origin (Fig. 320). Similarly, 
| Zi — Z 2 I is the distance between Zi and z 2 (Fig. 321). 

0 is called the argument of z and is denoted by arg z . Thus (Fig. 320) 

(4) 0 = arg z = arctan — (z =£ 0). 

x 


Geometrically, 0 is the directed angle from the positive A-axis to OP in Fig. 320. Here, as 
in calculus, all angles are measured in radians and positive in the counterclockwise sense. 

For z = 0 this angle 0 is undefined. (Why?) For a given z & 0 it is determined only 
up to integer multiples of 2tt since cosine and sine are periodic with period 2tt. But one 
often wants to specify a unique value of arg z of a given z =£ 0. For this reason one defines 
the principal value Arg z (with capital A!) of arg z by the double inequality 

(5) - 7 r < Arg z=i t. 

Then we have Arg z — 0 for positive real z = a, which is practical, and Arg z = 7T (not 
— 77!) for negative real z , e.g., for z = —4. The principal value ( 5 ) will be important in 
connection with roots, the complex logarithm (Sec. 13.7), and certain integrals. Obviously, 
for a given z ^ 0 the other values of arg z are arg z = Arg z - 2mr (n = ± 1, ±2, • • •)• 


Imaginary 

axis 



Fig. 320. Complex plane, polar form 
of a complex number 



Fig. 321. Distance between two 
points in the complex plane 
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CHAP. 13 Complex Numbers and Functions 


EXAMPLE 1 



Polar Form of Complex Numbers. Principal Value Arg z 

s = 1 + / (Fig. 322 ) has the polar form z = V 5 (cos £7 r + i sin 577). Hence we obtain 

|s| = V2. arg z = ^77 ± 2 «tt (/1 = 0, 1, • • ■)» and Arg z = |tt (the principal value). 
Similarly, z = 3 4 * 3 V 3 / = 6 (cos 377 + / sin 577), |z| — 6. and Arg * = 377. M 

CAUTION! In using (4), we must pay attention to the quadrant in which z lies, since 
tan 9 has period tt , so that the arguments of z and — z have the same tangent. Example : 
for $i = arg (1 -f /) and $ 2 = arg (—1 — i ) we have tan = tan 0 2 = 1. 


Triangle Inequality 

Inequalities such as x 1 < x 2 make sense for real numbers, but not in complex because 
there is no natural way of ordering complex numbers. However, inequalities between 
absolute values (which are real!), such as \zi\ < tal (meaning that zi is closer to the origin 
than z 2 ) are of great importance. The daily bread of the complex analyst is the triangle 
inequality 


( 6 ) 


\zi + Z 2 1 ^ Nil + |s 2 | 


(Fig. 323) 


which we shall use quite frequently. This inequality follows by noting that the three points 
0, zu and zi + z 2 are the vertices of a triangle (Fig. 323) with sides \zi |, \z 2 l and | zi + z 2 \, 
and one side cannot exceed the sum of the other two sides. A formal proof is left to the 
reader (Prob. 35). (The triangle degenerates if z 1 and z 2 lie on the same straight line through 
the origin.) 



Fig. 323. Triangle inequality 

By induction we obtain from (6) the generalized triangle inequality 


(6*) \zi + Z 2 + • • • + Zn\ S Nil + N 2 I + • • • + |z»|; 


that is, the absolute value of a sum cannot exceed the sum of the absolute values of the 
terms. 

EXAMPLE 2 Triangle Inequality 

If ci = 1 + / and z 2 — ”2 + 3 ?\ then (sketch a figure!) 

h + z 2 \ = |-I + 4 /| = VT7 = 4.123 < V 2 4 - Vl 3 = 5 . 020 . ■ 


Multiplication and Division in Polar Form 

This will give us a “geometrical” understanding of multiplication and division. Let 
= /'i(cos $i 4- i sin ^ x ) and 


z 2 = r 2 (cos 0 2 + i sin d 2 ). 
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EXAMPLE 3 


Multiplication. By (3) in Sec. 13.1 the product is at first 

hZ 2 = *V 2 [(cos 6 X cos 0 2 ~ sin 0* sin 8 2 ) + i(sin 0 X cos d 2 + cos 0 X sin 0 2 )]. 

The addition rules for the sine and cosine [(6) in App. A3.1] now yield 

(7) ZiZ 2 = r 1 ^* 2 [cos (0 X 4* 0 2 ) 4* i sin (0 X 4* 0 2 )]. 

Taking absolute values on both sides of (7), we see that the absolute value of a product 
equals the product of the absolute values of the factors, 

( 8 ) kiZal = \zi\\z 2 \. 

Taking arguments in (7) shows that the argument of a product equals the sum of the 
arguments of the factors, 

(9) arg (ziz 2 ) = arg Zi 4- arg z 2 (up to multiples of 27 t). 


Division. We have Zi = Ui/z 2 )z 2 . Hence |z x | = |(^i/z 2 )^ 2 l = kintal and by division 

by \z 2 \ 


( 10 ) 


h. 

Z 2 


M 

\zz\ 


(Z2 * 0). 


Similarly, arg Z\ = arg [(z^tel = arg iz\lz 2 ) + arg z 2 and by subtraction of arg z 2 


(ID 


arg — = arg z 1 — arg Z 2 (up to multiples of 27r). 

Z2 


Combining (10) and (11) we also have the analog of (7), 

(12) — = — [cos (0 X - 0 2 ) + i sin (0 X - 0 2 )]. 

z 2 > 2 

To comprehend this formula, note that it is the polar form of a complex number of absolute 
value i\lr 2 and argument 0 X — 0 2 . But these are the absolute value and argument of z x /z 2 , 
as we can see from (10), (1 1), and the polar forms of Z\ and z 2 . 

Illustration of Formulas (8)— (11) 

Let Z\ — —2 + 2/ and z 2 — 3/. Then z\z 2 = -6 — 6/. z\lz 2 = 2/3 + (2/3)/. Hence (make a sketch) 

\z lZ2 \ = 6V2 = 3V8 = klla \ Zl /z 2 \ = 2V2/3 = |j x |/|j 2 |, 
and for the arguments we obtain Arg z\ - IttM, Arg z 2 = tt/2, 



610 
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EXAMPLE 4 


Integer Powers of r. De Moivre’s Formula 

From (8) and (9) with Z\ = z 2 = z we obtain by induction for n = 0. 1. 2. • • • 

(13) z n = r n (cos n6 + t sin n0). 

Similarly, (12) with zj = 1 and z 2 = z u gives (13) for n = - 1, -2. • • • . For |z| = r = I, formula (13) becomes 
De Moivre’s formula 3 

(13*) (cos 9 + / sin 0) n = cos nO + i sin nQ. 

We can use this to express cos n9 and sin n6 in terms of powers of cos 6 and sin 6. For instance, for n = 2 we 
have on the left cos 2 0 + 21 cos Q sin 9 - sin 2 9. Taking the real and imaginary parts on both sides of (13*) 
with n - 2 gives the familiar formulas 

cos 29 = cos 2 9 - sin 2 8, sin 2$ = 2 cos 9 sin 9. 

This shows that complex methods often simplify the derivation of real formulas. Try « = 3. ■ 

Roots 

If z — w n (n = 1. 2, • • •), then to each value of w there corresponds one value of s. We 
shall immediately see that, conversely, to a given z # 0 there correspond precisely n 
distinct values of w. Each of these values is called an rcth root of z, and we write 

(14) w = Vz . 

Hence this symbol is multivalued, namely, n-valued. The n values of \^z can be obtained 
as follows. We write z and w in polar form 

z = r ( cos 6 + / sin 6) and w = R(c os <f> 4* / sin <£). 

Then the equation w n = z becomes, by De Moivre’s formula (with <f> instead of 9) 

w n = i? n ( cos n<f> + / sin n<t>) - z - r(cos 6 + / sin 9), 

The absolute values on both sides must be equal; thus, R n = r, so that R = ^7 , where 
V7 is positive real (an absolute value must be nonnegative!) and thus uniquely determined. 
Equating the arguments n<f> and 9 and recalling that 9 is determined only up to integer 
multiples of 2ir , we obtain 


n<t> = 0 + 2ki r, 


9 2Ictt 

thus 6 = 1 

n n 


where k is an integer. For k = 0, 1 , • • • , n - 1 we get n distinct values of w. Further 
integers of k would give values already obtained. For instance, k = n gives Ikirln = 2tt, 


3 ABRAHAM DE MOIVRE ( 1 667-1754), French mathematician, who pioneered the use of complex numbers 
in trigonometry and also contributed to probability theory (see Sec. 24.8). 



SEC 13.2 Polar Form of Complex Numbers. Powers and Roots 


611 


hence the w corresponding to k = 0, etc. Consequently, Vz , for z # 0, has the n distinct 
values 

( 15 ) Vz = Vr |cos 

where k = 0, !,-••,«- 1. These n values lie on a circle of radius Vr with center at 
the origin and constitute the vertices of a regular polygon of n sides. The value of Vi 
obtained by taking the principal value of arg z and k = 0 in (15) is called the principal 
value of w = Vz . 

Taking z — 1 in (15), we have \z\ = r = 1 and Arg z = 0. Then (15) gives 

nr- 2kir 2kir 

(16) V 1 = cos 1- / sin , k = 0, 1, • • • , n — 1. 

n n 


0 4- 2kir 0 + 

h / sin 

n i 


2kTT j 


These n values are called the /ith roots of unity. They lie on the circle of radius 1 and 
center 0, briefly called the unit circle (and used quite frequently!). Figures 324-326 show 
= 1, ± |V3 /, V\ = ±1, ±/ # and ^T. 

If c o denotes the value corresponding to k = 1 in (16), then the n values of VI can be 
written as 

1, a> y cu 2 , • • • , <o n - 1 . 

More generally, if is any nth root of an arbitrary complex number z (=£ 0), then the 
n values of Vz in (15) are 

(17) vt>i, Wx(o 2 f • • • , WxO) 11 " 1 

because multiplying by (o k corresponds to increasing the argument of w x by 2ki t/h. 
Formula (17) motivates the introduction of roots of unity and shows their usefulness. 



Fig. 324. Vi Fig. 325. Vl Fig. 326. Vl 


1-8 POLAR FORM 

1. 3 - 3/ 

2. 

2 i, -2/ 

Do these problems very carefully since polar forms will be 

3. -5 

4. 

5 + Z™ 

needed frequently. Represent in polar form and graph in 
the complex plane as in Fig. 322 on p. 608. (Show the 

s. ; + : 

6. 

3 V 2 + 2/ 

details of your work.) 

l - / 


— V2 — (2/3)/ 
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7. 


—6 4 5/ 
~3/ 


8 . 


2 4 3/ 
5 + 4/ 


PRINCIPAL ARGUMENT 

Determine the principal value of the argument. 

9. -1 - / 10. -20 4 i, -20 - / 

11. 4 ± 3/ 12. -7T 2 

13. 7 ± 7/ 14. (1 4 /)“ 

15. (9 + 9D 3 

CONVERSION TO x + iy 


16-20 


Represent in the form a* 4 iy and graph it in the complex 
plane. 


16. cos \tt 4- / sin (±^ 7 r) 
18. 4(cos|i t i / sin 577 ) 
20. 12 (cos §77 4 / sin §77) 


17. 3 (cos 0.2 4 i sin 0.2) 
19. cos (-1) 4 i sin (-1) 


21-25 


ROOTS 

Find and graph all roots in the complex plane. 

21. Vw 22. 

23. ^7 24. ^3 + 4/ 

25. 

26. TEAM PROJECT. Square Root, (a) Show that 
w = Vi has the values 

6 .01 

Wi = Vr ^cos - 4/stn-J , 

(18) w 2 = Vr |^cos 4 77 j 4 / sin 4 77 j J 


where sign y = 1 if y ^ 0, sign y = — 1 if y < 0, 
and all square roots of positive numbers are taken 
with positive sign. Hint: Use (10) in App. A3.1 with 
x = 0/2. 

(c) Find the square roots of 4/, 16 — 30/, and 
9 4 8V7/ by both (18) and (19) and comment on the 
work involved. 

(d) Do some further examples of your own and apply 
a method of checking your results. 

27-30 1 EQUATIONS 

Solve and graph all solutions, showing the details: 

27. z 2 - { 8 - 5 i)z + 40 - 20/ = 0 (Use (19).) 

28. z 4 4 (5 - 14/)z 2 - (24 4 10/) = 0 

29. 8z 2 - (36 - 6 i)z 4 42 - 11/ = 0 

30. z 4 4 16 = 0. Then use the solutions to factors 4 4 16 

into quadratic factors with real coefficients. 

31. CAS PROJECT. Roots of Unity and Their Graphs. 
Write a program for calculating these roots and for 
graphing them as points on the unit circle. Apply the 
program to z n = 1 with n = 2, 3, • • • , 10. Then extend 
the program to one for arbitrary roots, using an idea 
near the end of the text, and apply the program to 
examples of your choice. 

32-35 1 INEQUALITIES AND AN EQUATION 

Verify or prove as indicated. 

32. (Re and Im) Prove |Re z\ ^ |z|, |lm z\ = |z|. 

33. (Parallelogram equality) Prove 

ki + Z 2 \ 2 + |zi - Z 2 I 2 = 2 (|zi| 2 + |i 2 | 2 ). 


= —w x . 

(b) Obtain from (18) the often more practical formula 
(19) Vi = ±[V|(|z| 4 a ) 4 (signy)/ V \ (|z) 4.v)] 


Explain the name. 

34. (Triangle inequality) Verify (6) for Zi = 4 4 7/, 
z 2 = 5 4 2/. 

35. (Triangle inequality) Prove (6). 


13.3 Derivative. Analytic Function 

Our study of complex functions will involve point sets in the complex plane. Most 
important will be the following ones. 

Circles and Disks. Half-Planes 

The unit circle |z| = 1 (Fig. 327) has already occurred in Sec. 13.2. Figure 328 shows a 
general circle of radius p and center a. Its equation is 


\z~a\ = p 
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Fig. 327. Unit circle 



Fig. 328. Circle in the 
complex plane 



Fig. 329. Annulus in the 
complex plane 


because it is the set of all z whose distance | z — a\ from the center a equals p. Accordingly, 
its interior (“open circular disk”) is given by \z — a\ < p, its interior plus the circle itself 
(“closed circular disk”) by \z — a\ ^ p, and its exterior by \z — a\ > p. As an example, 
sketch this for a = 1 + i and p = 2, to make sure that you understand these inequalities. 

An open circular disk \z — a\ < p is also called a neighborhood of a or, more precisely, 
a p-neighborhood of a. And a has infinitely many of them, one for each value of 
p (> 0), and a is a point of each of them, by definition! 

In modem literature any set containing a p-neighborhood of a is also called a 
neighborhood of a. 

Figure 329 shows an open annulus (circular ring) Pi <\z — a\ < p 2 , which we shall 
need later. This is the set of all z whose distance \z - a\ from a is greater than p x but less 
than p 2 . Similarly, the closed annulus p x ^ \z — ^ p 2 includes the two circles. 

Half-Planes. By the (open) upper half-plane we mean the set of all points z = -v -f iy 
such that y > 0. Similarly, the condition y < 0 defines the lower half-plane , x > 0 the 
right half-plane, and x < 0 the left half-plane. 

For Reference: Concepts on Sets in the 
Complex Plane 

To our discussion of special sets let us add some general concepts related to sets that we 
shall need throughout Chaps. 13-18; keep in mind that you can find them here. 

By a point set in the complex plane we mean any sort of collection of finitely many 
or infinitely many points. Examples are the solutions of a quadratic equation, the points 
of a line, the points in the interior of a circle as well as the sets discussed just before. 

A set S is called open if every point of S has a neighborhood consisting entirely of 
points that belong to S. For example, the points in the interior of a circle or a square form 
an open set, and so do the points of the right half-plane Re z = x > 0. 

A set S is called connected if any two of its points can be joined by a broken line of 
finitely many straight-line segments all of whose points belong to S. An open and connected 
set is called a domain. Thus an open disk and an open annulus are domains. An open 
square with a diagonal removed is not a domain since this set is not connected. (Why?) 

The complement of a set S in the complex plane is the set of all points of the complex 
plane that do not belong to S . A set S is called closed if its complement is open. For 
example, the points on and inside the unit circle form a closed set (“closed unit disk”) 
since its complement |z| > 1 is open. 

A boundary point of a set S is a point every neighborhood of which contains both 
points that belong to S and points that do not belong to S. For example, the boundary 
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EXAMPLE 1 


EXAMPLE 2 


points of an annulus are the points on the two bounding circles. Clearly, if a set S is open, 
then no boundary point belongs to S ; if S is closed, then every boundary point belongs to 
S. The set of all boundary points of a set S is called the boundary of S. 

A region is a set consisting of a domain plus, perhaps, some or all of its boundary 
points. WARNING! “Domain” is the modem term for an open connected set. 
Nevertheless, some authors still call a domain a “region” and others make no distinction 
between the two terms. 

Complex Function 

Complex analysis is concerned with complex functions that are differentiable in some 
domain. Hence we should first say what we mean by a complex function and then define 
the concepts of limit and derivative in complex. This discussion will be similar to that in 
calculus. Nevertheless it needs great attention because it will show interesting basic 
differences between real and complex calculus. 

Recall from calculus that a real function / defined on a set S of real numbers (usually 
an interval) is a rule that assigns to every x in S a real number f(x ), called the value of 
/ at x. Now in complex, S is a set of complex numbers. And a function / defined on S is 
a rule that assigns to every z in S a complex number w, called the value of f at z . We write 

w = f(z ). 

Here z varies in S and is called a complex variable. The set S is called the domain of 
definition of f or, briefly, the domain of /. (In most cases S will be open and connected, 
thus a domain as defined just before.) 

Example: w = f(z) = z 2 4- 3z is a complex function defined for all z; that is, its domain 
S is the whole complex plane. 

The set of all values of a function / is called the range off. 

w is complex, and we write w = u 4- iv, where u and v are the real and imaginary 
parts, respectively. Now w depends on z = a* + iy. Hence u becomes a real function of x 
and y, and so does v. We may thus write 

w = f(z) = «Cy, y) 4- iv( a\ y). 

This shows that a complex function f(z ) is equivalent to a pair of real functions u(x, y) 
and v(x , y), each depending on the two real variables x and y. 

Function of a Complex Variable 

Let w = f(z) = z 2 + 3z. Find // and v and calculate the value of / at z = 1 + 3 i. 

Solution . it = Re f(z) = .v 2 - v 2 + 3.v and v = Ivy + 3 y. Also, 

/(I *4* 30 = (14 3/) 2 + 3(1 + 3i) = I - 9 + 6/ + 3 + 9i = -5 4 15/. 

This shows that m( 1, 3) = —5 and w(l, 3) = 15. Check this by using the expressions for u and v. H 

Function of a Complex Variable 

Let w = f(z) = 2 iz 4 6 z. Find u and u and the value of / at z = \ + 4/. 

Solution . f(z) = 2/(.v + iy) + 6(.v — iy) gives //(.v, y) - 6x — 2 y and u(x y y) = 2 y — 6 y. Also, 

/(£ + 4/) = 2 /(§ + 4/) + 6(| - 40 = / - 8 4 3 - 24/ = -5 - 23/. 


Check this as in Example l . 
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Remarks on Notation and Terminology 

1. Strictly speaking, f(z ) denotes the value of / at z, but it is a convenient abuse of 
language to talk about the function f(z ) (instead of the function /), thereby exhibiting the 
notation for the independent variable. 

2. We assume all functions to be single-valued relations, as usual: to each z in S there 
corresponds but one value w = f(z) (but, of course, several z may give the same value 
w = .f(z), just as in calculus). Accordingly, we shall not use the term “multivalued 
function” (used in some books on complex analysis) for a multivalued relation, in which 
to a z there corresponds more than one w. 


Limit, Continuity 

A function f(z) is said to have the limit / as z approaches a point z 0 , written 

(1) lim f(z) = /, 

2-^0 

if / is defined in a neighborhood of zq (except perhaps at zo itself) and if the values 
of f are “close” to / for all z “close” to z 0 ; in precise terms, if for every positive real e 
we can find a positive real 8 such that for all z =£ Zq in the disk | z — Zo\ < 8 (Fig- 330) 
we have 

(2) |/(z) — /| < e; 

geometrically, if for every z i 1 z 0 in that S-disk the value of f lies in the disk (2). 

Formally, this definition is similar to that in calculus, but there is a big difference. 
Whereas in the real case, x can approach an x 0 only along the real line, here, by definition, 
z may approach Zofrom any direction in the complex plane. This will be quite essential 
in what follows. 

If a limit exists, it is unique. (See Team Project 26.) 

A function f(z) is said to be continuous at z = Zq if f(zo) is defined and 

(3) lim f(z) = /(z 0 )- 

Z -* 2 0 

Note that by definition of a limit this implies that f(z) is defined in some neighborhood 
of Zo- 

f(z ) is said to be continuous in a domain if it is continuous at each point of this domain. 



Fig. 330. Limit 
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EXAMPLE 3 


EXAMPLE 4 


Derivative 

The derivative of a complex function / at a point z 0 is written f'(z 0 ) and is defined by 


(4) 


f'(z 0 ) = lim 

Az-* 0 


f(zo + Az) f(zp) 
Az 


provided tliis limit exists. Then / is said to be differentiable at Zq. If we write Az = z — Zo, 
we have z = Zo + Az and (4) takes the form 


(4') 


f'(Zo) = lim 

z—>z 0 


f(z) ~ f(Zp) 
Z - Zo 


Now comes an important point Remember that, by the definition of limit, /(z) is defined 
in a neighborhood of z 0 and z in (4') may approach zo from any direction in the complex 
plane. Hence differentiability at Zo means that, along whatever path z approaches Zo> the 
quotient in (4') always approaches a certain value and all these values are equal. This is 
important and should be kept in mind. 


Differentiability. Derivative 

The function f(z) = z 2 is differentiable for all z and has the derivative f\z ) = 2z because 


f'(z)= lim 

As— 0 


(z + Az) 2 - z 2 


Az 


= lim 
Ar— 1-0 


z + 2zA z + (Az)' 

A T 


„v2 _ _2 


= lim (2z + Az) = 2z. 

AZ^O 


The differentiation rules are the same as in real calculus, since their proofs are literally 
the same. Thus for any analytic functions / and g and constants c we have 


(cf)’ = cf, ( / + g)' = f' + g, ( fg )' = f'g + fg', 



fg - 
„2 


as well as the chain rule and the power rule ( z n Y = nz n 1 (n integer). 

Also, if /(z) is differentiable at zo» it is continuous at Zo- (See Team Project 26.) 

z not Differentiable 

It may come as a surprise that there are many complex functions that do not have a derivative at any point. For 
instance. f(z) = z = .v — iy is such a function. To see this, we write Az = Ax + /Ay and obtain 

f{z -I- Az) - /(z) _ (z + Az) - z Az Ax - /Av 

^ Az ~ Az Az Av + /Ay 

If Ay - 0, this is + 1 . If A.v = 0, this is — I . Thus (5) approaches + 1 along path I in Fig. 33 1 but - 1 along 
path n. Hence, by definition, the limit of (5) as Az — » 0 does not exist at any z. ■ 



Fig. 331. Paths in (5) 



SEC 13.3 Derivative. Analytic Function 


617 


Surprising as Example 4 may be, it merely illustrates that differentiability of a complex 
function is a rather severe requirement. 

The idea of proof (approach of z horn different directions) is basic and will be used 
again as the crucial argument in the next section. 

Analytic Functions 

Complex analysis is concerned with the theory and application of “analytic functions/* 
that is, functions that are differentiable in some domain, so that we can do “calculus in 
complex.** The definition is as follows. 


DEFINITION 


Analyticity 

A function f(z) is said to be analytic in a domain D if f(z) is defined and 
differentiable at all points of D. The function /(z) is said to be analytic at a point 
z = Zo in D if /(z) is analytic in a neighborhood of z 0 . 

Also, by an analytic function we mean a function that is analytic in some domain. 


Hence analyticity of f(z) at z. 0 means that /(z) has a derivative at every point in some 
neighborhood of Zo (including Zo itself since, by definition, Zo is a point of all its 
neighborhoods). This concept is motivated by the fact that it is of no practical interest if 
a function is differentiable merely at a single point Zo but not throughout some 
neighborhood of Zo- Team Project 26 gives an example. 

A more modern term for analytic in D is holomorphic in D. 


EXAMPLE 5 Polynomials, Rational Functions 

The nonnegative integer powers 1 f z, z 2 , • • ■ are analytic in the entire complex plane, and so are polynomials, 
dial is, functions of the form 

f(z) — c 0 + c x z + c 2 z 2 + ’ * * + c M c w 


where cq, • ■ ■ , c„ are complex constants. 

The quotient of two polynomials g(s) and h(z). 



is called a rational function. This / is analytic except at the points where h(z) = 0; here we assume that common 
factors of g and h have been canceled. 

Many further analytic functions will be considered in the next sections and chapters. H 


The concepts discussed in this section extend familial* concepts of calculus. Most important 
is the concept of an analytic function, the exclusive concern of complex analysis. Although 
many simple functions are not analytic, the large variety of remaining functions will yield 
a most beautiful branch of mathematics that is very useful in engineering and physics. 




1-10 


CURVES AND REGIONS OF 
PRACTICAL INTEREST 


Find and sketch or graph the sets in the complex plane given 
by 

1. |z - 3 - 2/| = | 2. 1 S \z - I + 4t| S 5 


3. 0 < \z - 1 1 < 1 
5. Imz 2 = 2 
7. \z + 1| = \z - 1| 
9. Re z S Im z 


4. — TT < Re Z < TT 


6 . Re z > — I 
8. |Arg z| S jtt 
10. Re (1/z) < 1 
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11. WRITING PROJECT. Sets in the Complex Plane. 
Extend the part of the text on sets in the complex plane 
by formulating that part in your own words and 
including examples of your own and comparing with 
calculus when applicable. 


COMPLEX FUNCTIONS AND DERIVATIVES 




2-15 


Function Values. Find Re / and Im /. Also find 
their values at the given point z. 

12. / = 3z 2 - 6z + 3 i, z = 2 + i 

13 . / = z/(z + 1 ), z = 4 - 5 / 

14 . / = 1/(1 - z).z = h+ 4 ' 

15. f = 1/z 2 , z = 1 + / 


16-19 1 Continuity. Find out (and give reason) whether 
f(z) is continuous at z = 0 if /( 0) = 0 and for z =£ 0 the 
function f is equal to: 

16. [Re (z 2 )]/|z| 2 17. [Im (z 2 )]/|z| 

18. |z| 2 Re (1/z) 19. (Imz)/(1 - |z|) 


1 20-24 1 Derivative. Differentiate 

20. (z 2 - 9)/(z 2 +1) 21. (z 3 + if 

21 . (3z + 4 /)/( 1.5/z - 2) 23 . //( 1 - zf 
24 . z 2 /(z + if 


25. CAS PROJECT. Graphing Functions. Find and 
graph Re /, Im /, and |/| as surfaces over the z-plane. 
Also graph the two families of curves Re /(z) = const 
and Im /(z) = const in the same figure, and the curves 
|/(z)| = const in another figure, where (a) f(z) = z 2 . 
(b) /(z) = 1/z, (c) f(z) = z 4 . 

26. TEAM PROJECT. Limit. Continuity, Derivative 

(a) Limit. Prove that (1) is equivalent to the pair of 
relations 

lim Re f(z) = Re /, lim Im /(z) = Im /. 

Z—*Zq 2—*2q 

(b) Limit. If lim /(z) exists, show that this limit is 
unique. 

(c) Continuity. If z lf Z 2 , * * * are complex numbers for 
which lim z n = a , and if /(z) is continuous at 

n— ►o o 

z = a. show that lim f(zn) = f(a). 

»»— sc 

(d) Continuity. If /(z) is differentiable at Zo* show that 
/(z) is continuous at Zo- 

(e) Differentiability. Show that /(z) = Re z = x is 
not differentiable at any z. Can you find other such 
functions? 

(f) Differentiability. Show that /(z) = |z| 2 is 
differentiable only at z = 0; hence it is nowhere analytic. 


13.4 Cauchy-Riemann Equations. 

Laplace’s Equation 

The Cauchy-Riemann equations are the most important equations in this chapter and 
one of the pillars on which complex analysis rests. They provide a criterion (a test) for 
the analyticity of a complex function 

w = /(z) = u(x, y) 4- iv(x y y). 

Roughly, / is analytic in a domain D if and only if the first partial derivatives of u and 
v satisfy the two Cauchy-Riemann equations 4 

( 1 ) U X Vy, Uy V X 


4 The French mathematician AUGUSTIN-LOUIS CAUCHY (see Sec. 2.5) and the German mathematicians 
BERNHARD R1EMANN (1826-1866) and KARL WEIERSTRASS (1815-1897: see also Sec. 15.5) are the 
founders of complex analysis. Riemann received his Ph.D. (in 1851) under Gauss (Sec. 5.4) at Gottingen, where 
he also taught until he died, when he was only 39 years old. He introduced the concept of the integral as it is 
used in basic calculus courses, and made important contributions to differential equations, number theory, and 
mathematical physics. He also developed the so-called Riemannian geometry, which is the mathematical 
foundation of Einstein's theory of relativity; see Ref. IGR9J in App. I. 
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THEOREM 1 


PROOF 


everywhere in D; here u x = bulb x and u y = bulby (and similarly for v) are the usual 
notations for partial derivatives. The precise formulation of this statement is given in 
Theorems 1 and 2. 

Example: f(z ) = z 2 = x 2 — y 2 4 2 ixy is analytic for all z (see Example 3 in Sec. 13.3), 
and u = x 2 — y 2 and v = 2 xy satisfy (1), namely, u x = 2x = v y as well as it y = — 2y = ~v x . 
More examples will follow. 


Cauchy-Riemann Equations 

Let f(z) = u(x, y) + /u(x, y) be defined and continuous in some neighborhood of a 
point z = x 4- iy and differentiable at z itself. Then at that point , the first-order 
partial derivatives of u and v exist and satisfy the Cauchy-Riemann equations ( 1 ). 

Hence if f(z) is analytic in a domain D, those partial derivatives exist and satisfy 
(1) at all points of D. 


By assumption, the derivative f\z ) at z exists. It is given by 


( 2 ) 


f\z) = lim 

Az— *0 


f(z + A z) ~ f(z) 
Az 


The idea of the proof is very simple. By the definition of a limit in complex (Sec. 13.3) 
we can let Az approach zero along any path in a neighborhood of z. Thus we may choose 
the two paths I and II in Fig. 332 and equate the results. By comparing the real parts we 
shall obtain the first Cauchy-Riemann equation and by comparing the imaginary parts the 
second. The technical details are as follows. 

We write Az = A.y + /Ay. Then z + Az = x + Ax + i(y + Ay), and in terms of u and 
v the derivative in (2) becomes 


(3) 




[u(x + Ax, y + Ay) -1- iu(x + Ax, y + Ay)] — [w(x, y) 4- iv(x, y)] 

Ax 4 /Ay 


We first choose path I in Fig. 332. Thus we let Ay — » 0 first and then Ax — » 0. After Ay 
is zero, Az = Ax. Then (3) becomes, if we first write the two w-terms and then the two 
u-terms, 


f'(z) = lim 

Ax— ►() 


u(x + Ax, y) — u(x, y) 
A* 


v(x + Ax, y) - v(x, y) 

+ / lim r 

o Ax 



Fig. 332. Paths in (2) 




620 


CHAP. 13 Complex Numbers and Functions 


EXAMPLE 1 


THEOREM 2 


Since f\z) exists, the two real limits on the right exist. By definition, they are the partial 
derivatives of u and v with respect to a. Hence the derivative f(z) of f(z) can be written 

(4) f'(z) = u x + iv x . 


Similarly, if we choose path II in Fig. 332, we let Aa — » 0 first and then Ay — > 0. After 
Aa is zero, Az = zAy, so that from (3) we now obtain 


,,, x wte v + Ay) - u( a, y) t . v(x, y + Ay) - v(x 9 .y) 

j (z) = hrn 4* z lim — 

o / Ay Ay — *-0 z Ay 


Since f r (z) exists, the limits on the right exist and give the partial derivatives of u and v 
with respect to y; noting that 1/z = — z, we thus obtain 


(5) f'(z) = -iuy + v y . 

The existence of the derivative f'(z) thus implies the existence of the four partial 
derivatives in (4) and (5). By equating the real parts u x and u y in (4) and (5) we obtain 
the first Cauchy-Riemann equation ( 1 ). Equating the imaginary parts gives the other. This 
proves the first statement of the theorem and implies the second because of the definition 
of analyticity. ■ 

Formulas (4) and (5) are also quite practical for calculating derivatives f'(z ), as we shall 
see. 


Cauchy-Riemann Equations 

f(z) = z 2 is analytic for all z. It follows that the Cauchy-Riemann equations must be satisfied (as wc have 
verified above). 

For f(z) — f = jc — iy we have u - a\ u = — v and see that the second Cauchy-Riemann equation is satisfied, 
Uy = — u x = 0, but the first is not: u x — 1 ^ v y = — 1. We conclude that f(z) = z is not analytic, confirming 
Example 4 of Sec. 13.3. Note the savings in calculation! I 

The Cauchy-Riemann equations are fundamental because they are not only necessary 
but also sufficient for a function to be analytic. More precisely, the following theorem 
holds. 


Cauchy-Riemann Equations 

If two real-valued continuous functions u( a, y) and v(x % y) of two real variables x 
and y have continuous first partial derivatives that satisfy the Cauchy-Riemann 
equations in some domain D, then the complex function f(z) — u( a, y) + iv( a, y) is 
analytic in D. 


The proof is more involved than that of Theorem 1 and we leave it optional (see App. 4). 

Theorems 1 and 2 are of great practical importance, since by using the 
Cauchy-Riemann equations we can now easily find out whether or not a given complex 
function is analytic. 
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EXAMPLE 2 


EXAMPLE 3 


Cauchy-Riemann Equations. Exponential Function 

Is f{z) = «(.v, v) + iu(x. y) = e x {cosy + i sin y) analytic? 

Solution . We have u = e x cos y\ v = e x sin y and by differentiation 

u x = e x cos y, = c* cos y 

iq, = -c* sin y, y^. = <?* sinv. 

We see that the Cauchy-Riemann equations are satisfied and conclude that f(z) is analytic for all z - (J“(c) will 
be the complex analog of e x known from calculus.) ■ 

An Analytic Function of Constant Absolute Value Is Constant 

The Cauchy-Riemann equations also help in deriving general properties of analytic functions. 

For instance, show that if f(z) is analytic in a domain D and |/(-)| = k = const in D, then /(c) = const in 
D. (We shall make crucial use of this in Sec. 18.6 in the proof of Theorem 3.) 

Solution. By assumption, |/| 2 = |» + iv | 2 = u 2 + y 2 = k 2 . By differentiation. 

uit x + vv x = 0 . 

UUy + Wy = 0. 

Now use v x = —iiy in the first equation and v y = u x in the second, to get 

(a) uu x — vtiy — 0. 

(6) 

(b) itity + vu x = 0. 

To get rid of u y . multiply (6a) by u and (6b) by u and add. Similarly, to eliminate u x , multiply (6a) by — v and 
(6b) by « and add. This yields 

l it 2 + v 2 )u x = 0 . 

(M 2 + V 2 )liy = 0. 

If k 2 = u 2 + y 2 = 0. then it = v = 0: hence / = 0. Tf k 2 — tt 2 + v 2 ¥* 0, then u x = u y = 0. Hence, by 
the Cauchy-Riemann equations, also v x = v tJ = 0. Together this implies it = const and y = const: hence 
/ = const. H 

We mention that if we use the polar form z = r(cos 6 + i sin 6) and set 
f(z) = w(/\ 6) + iv(j\ 0), then the Cauchy-Riemann equations are (Prob. 11) 

1 

Uy = “ V 09 

(7) ' (r > 0). 

1 

v r = - — u g 


Laplace's Equation. Harmonic Functions 

The great importance of complex analysis in engineering mathematics results mainly from 
the fact that both the real part and the imaginary part of an analytic function satisfy 
Laplace’s equation, the most important PDE of physics, which occurs in gravitation, 
electrostatics, fluid flow, heat conduction, and so on (see Chaps. 12 and 18). 
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THEOREM 3 


PROOF 


EXAMPLE 4 


Laplace’s Equation 

If f(z ) = m(a\ y) + iv( x y y) is analytic in a domain D, then both u and v satisfy 

Laplace’s equation 

( 8 ) V 2 li = + Uyy = 0 

(V 2 read “nabla squared”) and 

( 9 ) = V XX + Vyy = 0 , 

in D and have continuous second partial derivatives in D. 


Differentiating u x = v y with respect to x and u y = —v x with respect to >■, we have 


( 10 ) 


U.TX ~ V, 


yxf 


yy 


= —V 


xy 


Now the derivative of an analytic function is itself analytic, as we shall prove later (in 
Sec. 14.4). This implies that u and v have continuous partial derivatives of all orders; in 
particular, the mixed second derivatives are equal: v yx = v xy . By adding (10) we thus 
obtain (8). Similarly, (9) is obtained by differentiating u x = v y with respect to y and 
Uy = —v x with respect to x and subtracting, using u xy = u yx . ■ 

Solutions of Laplace’s equation having continuous second-order partial derivatives 
are called harmonic functions and their theory is called potential theory (see also 
Sec. 12.10). Hence the real and imaginary parts of an analytic function are harmonic 
functions. 

If two harmonic functions u and v satisfy the Cauchy-Riemann equations in a domain 
D, they are the real and imaginary parts of an analytic function / in D. Then u is said to 
be a harmonic conjugate function of u in D. (Of course, this has absolutely nothing to 
do with the use of “conjugate” for J.) 


How to Find a Harmonic Conjugate Function by the Cauchy-Riemann Equations 

Verify that u = ,v 2 — y 2 ~ y is harmonic in the whole complex plane and find a harmonic conjugate function 
u of u. 

Solution . V 2 h = 0 by direct calculation. Now u x = 2.v and u y — — 2v — 1. Hence because of the 
Cauchy-Riemann equations a conjugate v of u must satisfy 

Vy = tt x = 2\\ v x = -Uy = 2 y + 1. 

Integrating the first equation with respect to v and differentiating the result with respect to ,v. we obtain 


v = 2.vy + /i(.v). 



A comparison with the second equation shows that dhidx = 1. This gives h(x) = x + c. Hence v = 2.vy + .v -f c 
{c any real constant) is the most general harmonic conjugate of the given u. The corresponding analytic function is 

f(z) — u + iv = .V 2 — y 2 — y + /(2.vy + .v + c) = z 2 + iz + ic. I 
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Example 4 illustrates that a conjugate of a given harmonic function is uniquely determined 
up to an arbitrary real additive constant. 

The Cauchy-Riemann equations are the most important equations in this chapter. Their 
relation to Laplace’s equation opens wide ranges of engineering and physical applications, 
as we shall show in Chap. 18. 




1 1-10 1 CAUCHY-RIEMANN EQUATIONS 

22. 

Are the following functions analytic? [Use (1) or (7).] 

24. 

L fiz) = z 4 

2. f(z) = Im ( z 2 ) 

25. 

3. £ 2,r (cos y 4- / sin y) 

4. f{z) = I/d - z 4 ) 


5. ^ _a; (cos y — i sin y) 

6. Hz) ~ Arg vz 


7. f(z) — Re z 4" 1m z 

8. f{z) = In |s| + i Arg z 

26. 

9. fiz) = i/z s 

10. f(z) = z 2 + Uz 2 



tt = e 3x cos ay 23. ti — sin x cosh cy 

u = ax s 4- by 3 

(Harmonic conjugate) Show that if u is harmonic and 
v is a harmonic conjugate of w, then // is a harmonic 
conjugate of —v. 

TEAM PROJECT. Conditions for f(z) = const Let 
fix) be analytic. Prove that each of the following 
conditions is sufficient for fiz) = const. 


11. (Cauchy-Riemann equations in polar form) Derive 
(7) from (1). 


12-21 


HARMONIC FUNCTIONS 


Are the following functions harmonic? If your answer is 
yes, find a corresponding analytic function 
fiz) = it (a, y) + iv(x, y). 


12. u = xy 

14. v - -y/( x 2 4- y 2 ) 
16. v = In \z\ 

18. u = l/(.v 2 4- y 2 ) 
20. u = cos x cosh y 


13. v = A'V 
15. it — In |z| 

17. u = a * 3 - 3av 2 
19. v = (.v 2 - y 2 ) 2 
21. u — e~ x sin 2y 


1 2 2-24 1 Determine a , b, c such tliat the given functions 
are harmonic and find a harmonic conjugate. 


(a) Re f(z) = const 

(b) Im f(z) = const 

(c) f'(z) = 0 

(d) \f(z)\ = const (see Example 3) 

27. (Two further formulas for the derivative). Formulas 
(4), (5), and (11) (below) are needed from time to time. 
Derive 

(11) f'(z) = tt x - iu y , f\z) = v y + iv x . 

28. CAS PROJECT. Equipotential Lines. Write a 
program for graphing equipotential lines u = const of 
a harmonic function it and of its conjugate v on the 
same axes. Apply the program to (a) it = a' 2 — y 2 , 
v = 2xy, (b) it = .v 3 - 3 av 2 , u = 3.v 2 y - y 3 , 
(c) u = e x cos v, u ~ e x sin y. 


13.5 Exponential Function 

In the remaining sections of this chapter we discuss the basic elementary complex 
functions, the exponential function, trigonometric functions, logarithm, and so on. They 
will be counterparts to the familiar functions of calculus, to which they reduce when 
z = a* is real. They are indispensable throughout applications, and some of them have 
interesting properties not shared by their real counterparts. 

We begin with one of the most important analytic functions, the complex exponential 
function 

e z y also written exp z. 

The definition of e z in terms of the real functions e x , cos y, and sin y is 


( 1 ) 


e z = e*(cos y + / sin >•). 
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This definition is motivated by the fact the e z extends the real exponential function e x of 
calculus in a natural fashion. Namely: 

(A) e z = e x for real z = x because cos y = 1 and sin y = 0 when y = 0. 

(B) e z is analytic for all z. (Proved in Example 2 of Sec. 13.4.) 

(O The derivative of e* is e z , that is, 

(2) (e 2 )' = 

This follows from (4) in Sec. 13.4, 


( e z ) f = ( e x cos y) x 4- i(e x siny)* = e x cosy 4* ie x sin y = e z . 


REMARK. This definition provides for a relatively simple discussion. We could define e z by 
the familial* series 1 4- x + x 2 I2\ 4* a 3 /3! 4* • • • with a* replaced by z, but we would then have 
to discuss complex series at this very early stage. (We will show the connection in Sec. 15.4.) 

Further Properties. A function f(z) that is analytic for all z is called an entire function. 
Thus, e? is entire. Just as in calculus the functional relation 

(3) e - e e 

holds for any Z\ — 4- /y x and z 2 = *2 + Indeed, by (1), 

<?V 2 = ^ l (cosy! 4- / sin yj) ^ 2 (cos y 2 4- / siny 2 ). 

Since /V* 2 = ^ l+ * 2 for these ra?/ functions, by an application of the addition formulas 
for the cosine and sine functions (similar to that in Sec. 13.2) we see that 

<?V 2 = e Xl + x * [cos (.Vi + y 2 ) + > sin (y t + >> 2 )] = e x+z * 

as asserted. An interesting special case of (3) is Z\ = a\ Z 2 = OV then 

(4) e 2 = e x e iy . 

Furthermore, for z = iy we have from (1) the so-called Euler formula 

(5) e zy = cos y 4* i sin y. 


Hence the polar form of a complex number, z = r(cos 0 4- / sin 0), may now be written 

(6) z = re i0 . 

From (5) we obtain 

(7) e 2 ™ = 1 


as well as the important formulas (verify!) 

(8) e' ril2 = i, e™ = -1, e - ^ 2 = -/, 
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EXAMPLE 1 


Another consequence of (5) is 

(9) \e w \ = |cos y 4- i sin y| = Vcos 2 y + sin 2 y = 1. 

That is, for pure imaginary exponents the exponential function has absolute value 1, a 
result you should remember. From (9) and (1), 

(10) \e z \ = e x . Hence arg e z = y ± 2nir (n = 0, 1, 2, • • •)» 

since \e z \ = e x shows that ( 1 ) is actually e z in polar form. 

From \e^\ — e x =£ 0 in (10) we see that 

( 11 ) # * 0 for all z. 

So here we have an entire function that never vanishes, in contrast to (nonconstant) 
polynomials, which are also entire (Example 5 in Sec. 13.3) but always have a zero, as 
is proved in algebra. 

Periodicity of e z with period 27ri, 

(12) e z+27ri = e* for all z 

is a basic property that follows from (1) and the periodicity of cos y and sin y. Hence all 
the values that w = e 2 can assume are already assumed in the horizontal strip of width 

2 7T 

(13) —Tr<yfk 77 (Fig. 333). 

This infinite strip is called a fundamental region of e z . 

Function Values. Solution of Equations. 

Computation of values from ( 1 ) provides no problem. For instance, verify that 

e l.4-0.6i _ e l 4 (cos 0 6 _ . sJn 0 6) = 4.055(0.8253 - 0.5646/) = 3.347 - 2.289/ 

| e i.4-o.6i| _ e i.4 _ 4 055i ^ ^1.4— 0.6/ _ _ 0 6 

To illustrate (3), take the product of 

e 2+1 = ^ 2 (cos 1 + / sin 1) and e 4 "* = ^ 4 (cos 1 — / sin 1) 
and verify that it equals eV*(cos 2 1 + sin 2 1) = e 6 — e (2+i)+(4-t) . 

^1 




Fig. 333. Fundamental region of the 
exponential function e z in the z-plane 
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To solve the equation e z = 3 + 4 /, note first that |^| = e x = 5, x = In 5 = 1.609 is the real part of all 
solutions. Now, since e x — 5, 

e x cos y = 3, e x sin v = 4. cos v = 0.6. sin v = 0.8. v = 0.927. 

Ans . z = 1 .609 -I- 0.927/ ± Irnri (;/ — 0. 1. 2. • • •). These are infinitely many solutions (due to the periodicity 
of e z ). They lie on the vertical line jc = 1 .609 at a distance In from their neighbors. ■ 

To summarize: many properties of e z = exp z parallel those of e x \ an exception is the 
periodicity of e z with 2 t ti, which suggested the concept of a fundamental region. Keep in 
mind that e z is an entire function. (Do you still remember what that means?) 


'ROB L E MEr5 : ET— 1 3 ♦ 5 ~ 


1. Using the Cauchy-Riemann equations, show that e z is 
entire. 


2-8 


Values of e z . Compute <4 in the fonn u + iv and 


\e% where z equals: 
2. 3 + 7 ri 
4. V2 — \iri 
6. (1 4 - /)tt 
8. 977/72 


3. 1 + 2/ 
5. 7 7n/2 
7. 0.8 - 5/ 


9-12 


Real and Imaginary Parts. Find Re and lm of: 

9 *“ ** 

,2 


11 . 


13-17 


13. Vi 
15. ^ 
17. -9 


10. e* 

12. e Vz 


Polar Form. Write in polar form: 
14. 1 + i 
16. 3 + 4/ 


18-21 


Equations. Find all solutions and graph some of 


them in the complex plane. 


18. e** = 4 19. e 2 = -2 

20. e* = 0 21. <? 2 = 4 — 3/ 


22. TEAM PROJECT. Further Properties of the 
Exponential Function, (a) Analyticity. Show that e z 
is entire. What about <? 1/2 ? e z l e x (cos ky + i sin ky)2 
(Use the Cauchy-Riemann equations.) 

(b) Special values. Find all z such that (i) e z is real, 
(ii) \e“ z \ < 1 , (iii) e z = e z . 

(c) Harmonic function. Show that 

u = e xy cos (a- 2 /2 — y 2 /2) is harmonic and find a 
conjugate. 

(d) Uniqueness. It is interesting that f(z ) = e z is 
uniquely determined by the two properties 
f{x + / 0) = e x andf'(z) = /(z), where / is assumed 
to be entire. Prove this using the Cauchy-Riemann 
equations. 


13.6 Trigonometric and Hyperbolic Functions 

Just as we extended the real e x to the complex e z in Sec. 13.5, we now want to extend 
the familiar real trigonometric functions to complex trigonometric functions. We can do 
this by the use of the Euler formulas (Sec. 13.5) 

e ux = cos x + / sin a, e~ w = cos x - / sin x. 

By addition and subtraction we obtain for the real cosine and sine 

cos x = — - (e™ 4* £ _w? ), sin x = — (e™ — e~™). 

2 21 

This suggests the following definitions for complex values z = x + iy: 
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EXAMPLE 1 


( 1 ) 


cos z = — {e 7Z -F e **), 


sin *' a 




It is quite remarkable that here in complex, functions come together that are unrelated in 
real. This is not an isolated incident but is typical of the general situation and shows the 
advantage of working in complex. 

Furthermore, as in calculus we define 


( 2 ) 

and 


tan z = 


sin z 
cos z 


cotz = 


cosz 
sin z 


(3) 


1 

sec z = , 

cos z 


1 

CSC z = — . 

sm z 


Since e 2 is entire, cos z and sin z are entire functions, tan z and sec z are not entire; they 
are analytic except at the points where cos z is zero; and cot z and esc z are analytic except 
where sin z is zero. Formulas for the derivatives follow readily from (e z ) f = e z and (1)— (3); 
as in calculus, 

(4) (cosz)* = — sinz, (sinz/ = cosz, (tanz) ; = sec 2 z, 
etc. Equation (1) also shows that Euler’s formula is valid in complex: 


(5) 


e* = cos z + / sin z 


for all z. 


The real and imaginary parts of cos z and sin z are needed in computing values, and 
they also help in displaying properties of our functions. We illustrate this with a typical 
example. 

Real and Imaginary Parts. Absolute Value. Periodicity 

Show that 



(a) 

cos z = cos a cosh y — i sin a* sinh y 

(6) 

(b) 

sin z — sin a* cosh y + i cos a* sinh y 

and 



(7) 


(a) |cos z\ 2 — cos 2 x + sinh 2 y 

(b) |sin z\ 2 = sin 2 x + sinh 2 y 


and give some applications of these formulas. 

Solution . From (1), 

cos z = 4- c - Ux+i v } ) 

= ^“^(cos .v + i sin x) + ^(cos x — i sin a*) 
= |( e v + e~ y ) cos x - \i(e y — e~ y ) sin a*. 

This yields (6a) since, as is known form calculus, 

(8) cosh y = \{e v + e~ y ) y sinh v = ^{e v — 
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EXAMPLE 2 


(6b) is obtained similarly. From (6a) and cosh 2 y = 14- sinh 2 y we obtain 

|cos z| 2 = ( cos 2 jc ) (I 4* sinh 2 y) 4 sin 2 x sinh 2 y. 

Since sin 2 a* 4 cos 2 a = 1, this gives (7a). and (7b) is obtained similarly. 

For instance, cos (2 4 3 i) = cos 2 cosh 3 - i sin 2 sinh 3 — -4.190 - 9.109/. 

From (6) we see that cos z and sin z are periodic with period 2tt, just as in real. Periodicity of tan z and 
cot z with period 7 r now follows. 

Fonnula (7) points to an essential difference between the real and the complex cosine and sine; whereas 
|cos *| = 1 and jsin x\ ^ 1, the complex cosine and sine functions are no longer bounded but approach infinity 

in absolute value as y -* since then sinh y in (7). ■ 

Solutions of Equations. Zeros of cos z and sin z 

Solve (a) cos z = 5 (which has no real solution!), (b) cos z = 0, (c) sin 2 = 0. 

Solution, (a) e 2iz — \0c jtz 4 1=0 from (1) by multiplication by e 12 . This is a quadratic equation in e* 2 , 
with solutions (rounded off to 3 decimals) 

e 2 = e~ y+ix = 5 ± V25 - 1 = 9.899 and 0.101. 

Thus e~ v = 9.899 or 0.101, /* = 1 j = ±2.292, a = Inn, Ans. z = ±2nir ± 2.292 / (n = 0, 1, 2. * • •). 

Can you obtain this from (6a)? 

(b) cos a = 0. sinh v = 0 by (7a), y = 0. Ans . z = ±^(2 n 4 1)tt (?i = 0, 1, 2, • • •)• 

(c) sin a = 0. sinh y = 0 by (7b). Ans. z = ±mr ( n =0, 1, 2, • • •). Hence the only zeros of cos z and 

sin z are those of the real cosine and sine functions. M 

General formulas for the real trigonometric functions continue to hold for complex 
values. This follows immediately from the definitions. We mention in particular the 
addition rules 

cos (zi ± z 2 ) = cos £i cos ^2 + s ^ n s ^ n *2 
sin (, zi ± z 2 ) “ sin Zi cos z 2 ± sin z 2 cos Z\ 

and the formula 

(10) cos 2 z + sin 2 z = 1. 

Some further useful formulas are included in the problem set. 


Hyperbolic Functions 

The complex hyperbolic cosine and sine are defined by the formulas 

(11) cosh z — 2 (e* 4- e~ z ), sinh z = \{e z — e~ z ). 

This is suggested by the familiar definitions for a real variable [see (8)]. These functions 
are entire, with derivatives 

(12) (cosh zY = sinh z , (sinh zY = cosh z, 

as in calculus. The other hyperbolic functions are defined by 
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cosh z 

coth z = — — , 
sinh z 

1 

csch z = ~~r ~ t — • 
smh z 

Complex Trigonometric and Hyperbolic Functions Are Related. If in (1 1), we replace 
- by iz and then use (1), we obtain 

(14) cosh iz — cos z, sinh iz — i sin z. 

Similarly, if in (1) we replace z by iz and then use (1 1), we obtain conversely 

(15) cos iz = cosh z, sin iz = / sinh z* 

Here we have another case of unrelated real functions that have related complex analogs, 
pointing again to the advantage of working in complex in order to get both a more unified 
formalism and a deeper understanding of special functions. This is one of the main reasons 
for the importance of complex analysis to the engineer and physicist. 



1. Prove that cos z, sin z, cosh z, sinh z. are entire 
functions. 

2. Verify by differentiation that Re cos z and Im sin z are 
harmonic. 

3-6] FORMULAS FOR HYPERBOLIC FUNCTIONS 

Show that 

3. cosh z = cosh a* cos y 4- / sinh x sin y 
sinh z = sinh x cos y 4 / cosh x sin y. 

4 . cosh (zi + z 2 ) = cosh z\ cosh z 2 + sinh Z\ sinh z 2 
sinh (zi + z 2 ) = sinh Zi cosh z 2 4 cosh Z\ sinh z 2 . 


14. sinh (4 — 3/) 15. cosh (4 - 67 ri) 

16. (Real and imaginary parts) Show that 


Re tan z 
Im tan z 


sin x cos x 
cos 2 x 4 sinh 2 y 

sinh y cosh y 
cos 2 x 4 sinh 2 y 


1 17-21 1 Equations. Find all solutions of the following 
equations. 

17. cosh z = 0 18. sin z = 100 

19. cos z = 2/ 20. cosh z = — 1 

21. sinh z = 0 


5. cosh 2 z — sinh 2 z = 1 

6. cosh 2 z 4 sinh 2 z = cosh 2z 

7-15 Function Values. Compute (in the form 11 4 iv) 

7. cos (1 4 /) 8. sin (1 4 /) 

9. sin 5/, cos 5/ 10. cos 3 iri 

11. cosh (-2 4 3/), cos (-3 - 2/ ) 

12. — / sinh ( — it 4 20, sin (2 4 777 ) 

13. cosh (2/z 4 l)7n\ n = l, 2. • • • 


22. Find all z for which (a) cos z, (b) sin z has real values. 

23-25 Equations and Inequalities. Using the 
definitions, prove: 

23. cosz is even, cos(— z) = cosz, and sinz is odd, 
sin (— z) = —sin z. 

24. (sinh y| ^ (cos z| ^ cosh y, |sinh y| ^ |sin z| = cosh y. 
Conclude tliat the complex cosine and sine are not 
bounded in the whole complex plane. 

25. sin Zi cos z 2 = |[sin (z x + z 2 ) + sin {z x - z 2 )} 
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13.7 Logarithm. General Power 

We finally introduce the complex logarithm, which is more complicated than the real 
logarithm (which it includes as a special case) and historically puzzled mathematicians 
for some time (so if you first get puzzled — which need not happen! — be patient and work 
through this section with extra care). 

The natural logarithm of z = x 4- iy is denoted by In z (sometimes also by log z ) and 
is defined as the inverse of the exponential function; that is, w = In z is defined for 
z i 2 0 by the relation 


(Note that z = 0 is impossible, since e w # 0 for all w; see Sec. 13 . 5 .) If we set 
w = u + iv and z = re t(i , this becomes 

= e u * iv = re i0 . 

Now from Sec. 13.5 we know that e u+tv has the absolute value e u and the argument v. 
These must be equal to the absolute value and argument on the right: 

e u = r, v = A 

e u = r gives u = In r, where In r is the familiar real natural logarithm of the positive 
number r = |z|. Hence w = u + iv = In z is given by 

(1) In z = In /■ 4- i$ (r = \z\ > 0, 0 = arg z). 

Now comes an important point (without analog in real calculus). Since the argument of 
z is determined only up to integer multiples of 27 r, the complex natural logarithm In z 
(z # 0) is infinitely many-valued . 

The value of In z corresponding to the principal value Arg z (see Sec. 13 . 2 ) is denoted 
by Ln z (Ln with capital L) and is called the principal value of In z. Thus 

(2) Ln z = In |z| + i Arg z (z # 0). 

The uniqueness of Arg z for given z (# 0) implies that Ln z is single- valued, that is, a 
function in the usual sense. Since the other values of arg z differ by integer multiples of 
27 t, the other values of ln z are given by 

(3) In z = Ln z ± 2niri (n =1,2,** •)• 

They all have the same real part, and their imaginary parts differ by integer multiples of 27 r. 

If z is positive real, then Arg z = 0, and Ln z becomes identical with the real natural 
logarithm known from calculus. If z is negative real (so that the natural logarithm of 
calculus is not defined!), then Arg z = 7r and 


Ln z = In |z| + 777 


(z negative real). 
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EXAMPLE 1 


EXAMPLE 2 


From (1) and e in r = r for positive real /* we obtain 
(4a) e ln z = z 

as expected, but since arg ( e z ) = y ± 2mr is multivalued, so is 
(4b) In (e z ) = z ± 2nm, n = 0, 1, • • • . 


Natural Logarithm. Principal Value 

In 1 =0, ±27 iri, ±47 r/. • * * 

In 4 = 1.386 294 ± 2nm 

In (—1) = ±77/. ±377/, ±5 77/, • • • 

In (—4) = 1.386 294 ± (2 // + 1)77/ 

In / = 77/72. — 3 t7/2. 577/72, • • • 

In 4/ = 1.386 294 + 77/72 ± 2 //tt/ 

In (—4/) = 1.386 294 — mil ± Iniri 
In (3 - 4i) = In 5 4- / arg (3 — 4 i) 

= 1.609 438 - 0.927 295/ ± 2m n 


Ln 1 = 0 
Ln 4 = 1.386 294 
Ln (— 1 ) = 77 / 

Ln (—4) = 1.386294 + m 
Ln / = 77/72 

Ln 4/ = 1.386 294 + m/2 
Ln (—4/) = 1.386 294 — m/2 
Ln (3 - 40 = 1.609 438 - 0.927 295/ 
(Fig. 334) 



Fig. 334. Some values of In (3 — 4/) in Example 1 


The familiar relations for the natural logarithm continue to hold for complex values, 
that is, 

(5) (a) In (Z 1 Z 2 ) = In h + In z 2 > (b) In (ii/z 2 ) — In Zi - ln z 2 

but these relations are to be understood in the sense that each value of one side is also 
contained among the values of the other side; see the next example. 

illustration of the Functional Relation (5) in Complex 

Let 

-l ~ -2 “ e ~ “L 

If we take the principal values 

Ln n = Ln z 2 = m, 

then (5a) holds provided we write In (< 1 ^ 2 ) = in I = 2m; however, it is not true for the principal value, 
Ln (zj£ 2 ) = Ln 1 = 0. ■ 
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Analyticity of the Logarithm 

For every n = 0, ±1, ±2, • ■ • formula (3) defines a function, which is analytic , 
except at 0 and on the negative real axis ; and has the derivative 

, 1 

(6) (In z) = — (z not 0 or negative real). 


PROOF We show that the Cauchy-Riemann equations are satisfied. From (l)-(3) we have 


In z = In /* 4- i(0 4- c) = — In (a 2 + y 2 ) 4- /^arctan — 4- cj 


where the constant c is a multiple of 2ir. By differentiation, 


1 


1 




x 2 4- y 2 


= v„ = 


1 + (y/xf x 


1 


lL y ,.2 


x 4, + y 


= -v- = - 


1 4- (y/xf 


(-*)■ 


Hence the Cauchy-Riemann equations hold. [Confirm this by using these equations in 
polar form, which we did not use since we proved them only in the problems (to 
Sec. 13.4).] Formula (4) in Sec. 13.4 now gives (6), 


(In z) u x + iv x ^.2 _j_ y 2 1 


1 1 + (y/xf ( a - 2 ) 


x - ty 
x 2 + y 2 


Each of the infinitely many functions in (3) is called a branch of the logarithm. The 
negative real axis is known as a branch cut and is usually graphed as shown in Fig. 335. 
The branch for n = 0 is called the principal branch of In z. 


Fig. 335. Branch cut for In z 


General Powers 

General powers of a complex number z = x + iy are defined by the formula 

(7) z c = e c In 2 (c complex, z * 0). 

Since In z is infinitely many-valued, z c will, in general, be multivalued. The particular 
value 

= e cLnz 


is called the principal value of if. 
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If c = n = 1, 2, • • • , then z n is single- valued and identical with the usual nth power 
of z. If c = — l, —2, • • • , the situation is similar. 

If c = 1/n, where n = 2, 3, • • • , then 

= ^ = (z*0), 

the exponent is determined up to multiples of 2m/n and we obtain the n distinct values 
of the nth root, in agreement with the result in Sec. 13.2. If c = ptq , the quotient of two 
positive integers, the situation is similar, and z c has only finitely many distinct values. 
However, if c is real irrational or genuinely complex, then z c is infinitely many-valued. 

EXAMPLE 3 General Power 

i x = e l ln 1 = exp (/ In /) = exp ^/ i ± 2mrij J = e -(.rrf 2 )~ 2 mr 

All these values are real, and the principal value (n = 0) is e~ n/2 . 

Similarly, by direct calculation and multiplying out in the exponent, 

(1 + 0 2_1 = exp [(2 — i) In (1 + /)] = exp [(2 - /) {In V2 4 J-tt/ ± 2niri)] 

= 2e w/4:!:2n7r [sin In 2) -1- / cos (§ In 2)]. ■ 

It is a convention that for real positive z = x the expression z c means e c ln x where In x 
is the elementary real natural logarithm (that is, the principal value Ln z (z = x > 0) in 
the sense of our definition). Also, if z = e, the base of the natural logarithm, z c = e c is 
conventionally regarded as the unique value obtained from (1) in Sec. 13.5. 

From (7) we see that for any complex number a , 

(8) a z = e zlna . 

We have now introduced the complex functions needed in practical work, some of them 
(e z y cos z, sin z, cosh z, sinh z) entire (Sec. 13.5), some of them (tan z, cot z, tanh z, coth z) 
analytic except at certain points, and one of them (In z) splitting up into infinitely many 
functions, each analytic except at 0 and on the negative real axis. 

For the inverse trigonometric and hyperbolic functions see the problem set. 



1 1-9 1 Principal Value Ln z. Find Ln z when z equals: 

1 . -10 2 . 2 + 2 / 

3. 2 - 2\ 4. -5 ± 0.1/ 

5. -3-4/ 6.-100 

7. 0.6 4- 0.8/ 8. — ei 

9. 1 - / 


12. ln * 13. In (-6) 

14. ln (4 + 30 15. In (-£?“*) 

16. \n(e 3i ) 

17. Show that the set of values of ln (/ 2 ) differs from the 
set of values of 2 ln /. 


10-16 1 All Values of ln z. Find all values and graph 
some of them in the complex plane. 

10. lnl 11. ln(-l) 


18-21 


Equations. Solve for z: 


18. lnz = (2 — |/)tt 19. ln z = 0.3 + 0.7/ 

20. ln z = e — 7ri 21. ln z = 2 + 
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General Powers. Showing the details of your 
work, find the principal value of: 

22. / 2 \ (2/) i 23. 4 3+ * 

24. (1 - /) 1+i 25. (1 4- /) 1 ~ i 

26. (-I) 1 " 2 * 27. i m 

28. (3 - 4/) 1/3 

29. How can you find the answer to Prob. 24 from the 
answer to Prob. 25? 

30. TEAM PROJECT. Inverse Trigonometric and 
Hyperbolic Functions. By definition, the inverse sine 
w = arcsin z is the relation such that sin w = z . The 
inverse cosine w = arccos z is the relation such that 
cos w = z. The inverse tangent, inverse cotangent, 
inverse hyperbolic sine, etc., are defined and denoted 
in a similar fashion. (Note that all these relations are 
multivalued .) Using sin w = (e tw — e“ zw )f(2i) and 
similar representations of cos u\ etc., show that 


(a) arccos z — — / In (z + Vz 2 — 1) 

(b) arcsin z - —i In (/z 4 V 1 - z 2 ) 

(c) arccosh z — In (z 4 Vz 2 — 1) 

(d) arcsinh z = In (z 4 Vz 2 4 1) 

/ / 4* z 

(e) arctan z = — In 

2 i - z 

1 1 4 z 

(f) arctanh z = — In - 

(g) Show that w = arcsin z is infinitely many- valued, 

and if w x is one of these values, the others are of the 
form iv! ± 2n7r and tt — ± 2rnr, n = 0, 

(The principal value of vv = u 4 iv — arcsin z is 
defined to be the value for which — tt/2 ^ u ^ tt/ 2 
if v ^ 0 and — tt!2 < u < - tt /2 if v < 0 .) 


JEHAEZEER 13=ftE V 1 E W-Q ITE S T I O N S AND PROBLEMS 


1. Add, subtract, multiply, and divide 26 - 7/ and 
3 4- 4/ as well as their complex conjugates. 

2. Write the two given numbers in Prob. 1 in polar form. 
Find the principal value of their arguments. 

3. What is the triangle inequality? Its geometric meaning? 
Its significance? 

4. If you know the values of ^T, how do you get from 
them the values of ^ Vz for any z? 

5. State the definition of the derivative from memory. It 
looks similar to that in calculus. But what is the big 
difference? 

6. What is an analytic function? How would you test for 
analyticity? 

7. Can a function be differentiable at a point without being 
analytic there? If yes, give an example. 

8. Are |z|, z, Re z, Im z analytic? Give reason. 

9. State the definitions of e 2 , cos z, sin z, cosh z, sinh z and 
the relations between these functions. Do these relations 
have analogs in real? 

10. What properties of e z are similar to those of e x ? Which 
one is different? 

11. What is the fundamental region of e z ? Its significance? 

12. What is an entire function? Give examples. 

13. Why is In z much more complicated than In a? Explain 
from memory. 

14. What is the principal value of In z? 

15. How is the general power z c defined? Give examples. 


16-21 1 Complex Numbers. Find, in the form x + /y, 
showing the details: 

16. (1 + /)“ IV- (-2 + 6 if 

18. 1/(3 - 7 i) 19. (1 - /)/(l + if 

20. V— 5 - 12/ 21. (43 - 190/(8 + 0 


22-26 1 Polar Form. Represent in polar form, with the 
principal argument: 

22. I - 3/ 23. -6 + 6/ 

24. V20/(4 + 20 25.-12/ 

26. 2 + 2/ 

Roots. Find and graph all values of 


27-30 


27. VSi 
29. 


28. V 256 


30. V32 - 24/ 


31-35 


Analytic Functions. Find /(z) = h(a\ v) 4 iv(x,y) 
with u or v as given. Check for analyticity. 

31. u = a*/(a* 2 4 y 2 ) 32. v = e~ 3x sin 3 y 

33. u = x 2 - 2xy - y 2 34. it = cos 2x cosh 2y 
35. v = e* 2 -» 2 sin 2xv 


36-39 


Harmonic Functions. Are the following 
functions harmonic? If so, find a harmonic conjugate. 

36. x 2 y 2 37. xy 

38. e~ xi2 cos|y 39. a* 2 + v 2 


40-45 


Special Function Values. Find the values of 
40. sin (3 4 4'7n‘) 41. sinh47n 

42. cos (5 t r + 2 i) 43. Ln (0.8 4- 0.6i) 

44. tan ( I 4 /) 45. cosh (1 4 rri) 
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For arithmetic operations with complex numbers 

(1) z = x + iy = re 10 = r(cos 0 + / sin 0), 

r = |z| = Va 2 + y 2 , 0 = arctan (y/A), and for their representation in the complex 
plane, see Secs. 13.1 and 13.2. 

A complex function f(z ) = u( a, y) + iv(x, y) is analytic in a domain D if it has 
a derivative (Sec. 13.3) 


( 2 ) 


f\z) = lim 

lz — *-0 


fix + A z) ~ /fe) 
Az 


everywhere in D. Also, /(z) is analytic at a point z = 4 o if it has a derivative in a 
neighborhood of Zo (not merely at z 0 itself). 

If /(z) is analytic in D, then m(a, y) and u(a, satisfy the (very important!) 
Cauchy-Riemann equations (Sec. 13.4) 

du _ du _ dt; 

~dx ~ ~dy ' ay “ ” a a 

everywhere in D. Then w and y also satisfy Laplace’s equation 

(4) U XX H“ Uyy — 0, V XX "f Vyy ~ 0 

everywhere in £). If m(a, y) and y(A, y) are continuous and have continuous partial 
derivatives in D that satisfy (3) in D, then /(z) = u(x, y) + iv(x, y) is analytic in 
D. See Sec. 13.4. (More on Laplace’s equation and complex analysis follows in 
Chap. 18.) 

The complex exponential function (Sec. 13.5) 

(5) e z = exp z = e x (cos y -I- i sin y) 

reduces to e x if z = x (y = 0). It is periodic with liri and has the derivative e z . 
The trigonometric functions are (Sec. 13.6) 

cos z = — ( e tz 4- e~ u ) = cos x cosh y - i sin x sinh y 

(6) \ 

sin z = — (e tz — e" 1 *) — sin a coshy 4 i cos x sinh y 
and, furthermore, 

tan z = (sin z)/cos z, cot z = 1/tan z, etc. 
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The hyperbolic functions are (Sec. 13.6) 

(7) cosh z = + e~ z ) = cos iz, sinh z = — e~ z ) = — / sin iz 

etc. The functions (5)-(7) are entire, that is, analytic everywhere in the complex 
plane. 

The natural logarithm is (Sec. 13.7) 

(8) In z — In |z| + i arg z = In |z| + i Arg z ± 2nm 

where z # 0 and n = 0, 1, • • • . Arg z is the principal value of argz, that is, 
— tt < Arg ziir, We see that In z is infinitely many-valued. Taking n — 0 gives 
the principal value Ln z of In z; thus Ln z = In |z| + i Arg z. 

General powers are defined by (Sec. 13.7) 

(9) z c = e clnz ( c complex, z # 0). 





CHAPTER I 4 
Complex Integration 


Two main reasons account for the importance of integration in the complex plane. The 
practical reason is that complex integration can evaluate certain real integrals appearing 
in applications that are not accessible by real integral calculus. The theoretical reason is 
that some basic properties of analytic functions are difficult to prove by other methods. 
A striking property of this type is the existence of higher derivatives of an analytic function. 

Complex integration also plays a role in connection with special functions, such as the 
gamma function (see [GR1], p. 255), the error function, various polynomials (see [GR10]) 
and others, and the application of these functions in physics. 

In this chapter we define and explain complex integrals. The most important result in 
the chapter is Cauchy’s integral theorem or the Cauchy-Goursat theorem, as it is also 
called (Sec. 14.2). It implies Cauchy’s integral formula (Sec. 14.3), which in turn implies 
the existence of all higher derivatives of an analytic function. Hence in this respect, 
complex analytic functions behave much more simply than real-valued functions of real 
variables, which may have derivatives only up to a certain order. 

A further method of complex integration, known as integration by residues, and its 
application to real integrals will need complex series and follows in Chap. 16. 

Prerequisite: Chap. 13 

References and Answers to Problems : App. 1 Part D, App. 2. 


14.1 Line Integral in the Complex Plane 

As in calculus we distinguish between definite integrals and indefinite integrals or 
antiderivatives. An indefinite integral is a function whose derivative equals a given 
analytic function in a region. By inverting known differentiation formulas we may find 
many types of indefinite integrals. 

Complex definite integrals are called (complex) line integrals. They are written 

f f(z) dz . 

J c 

Here the integrand f(z ) is integrated over a given curve C or a portion of it (an arc, but 
we shall say “curve” in either case, for simplicity). This curve C in the complex plane is 
called the path of integration. We may represent C by a parametric representation 

(1) z(t) = x{t) + iy{t) (a ^t^b). 
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The sense of increasing t is called the positive sense on C, and we say that C is oriented 
by (1). 

For instance, z(f) - t + 3/f (0 ^ ^ 2) gives a portion (a segment) of the line y = 3x. 
The function z(t) = 4 cos f + 4/ sin t (- 7r ^ ^ 7r) represents the circle \z\ = 4, and so 
on. More examples follow below. 

We assume C to be a smooth curve, that is, C has a continuous and nonzero derivative 

s(0 = ^ = i(0 + oKO 


at each point. Geometrically this means that C has everywhere a continuously turning 
tangent, as follows directly from the definition 


z(t) = lim 


;(r + A Q - 2(0 

At 


(Fig. 336). 


Here we use a dot since a prime ' denotes the derivative with respect to z. 


Definition of the Complex Line Integral 

This is similar to the method in calculus. Let C be a smooth curve in the complex plane 
given by (1), and let f(z) be a continuous function given (at least) at each point of C. We 
now subdivide (we “partition”) the interval a tk t ^ b in (1) by points 

to (= ci\ h, • , r n _ x , t n (= b) 

where / 0 < h < * ’ * < f rr To this subdivision there corresponds a subdivision of C by 
points 

Zo> Z\ 9 ’ ’ • , z n _ 1 , Zn(= Z) (Fig. 337), 



Fig. 336. Tangent vector z(t) of a curve C in the 
complex plane given by z(t). The arrowhead on the 
curve indicates the positive sense (sense of increasing t). 



where Zj — z(tj). On each portion of subdivision of C we choose an arbitrary point, say, 
a point between Zq and z x (that is, & = z(t) where t satisfies = * = *i)» a point £ 2 
between z x and z 2i etc. Then we form the sum 

n 

(2) S n 2) f(£m) where Z??i Z?n— 1* 

m=l 

We do this for each n = 2, 3, • • • in a completely independent manner, but so that the 
greatest |Ar ?n | = \t m - / m _ x | approaches zero as n -» <*. This implies that the greatest 
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\Azm\ also approaches zero. Indeed, it cannot exceed the length of the arc of C from z m _ x 
to Ztn and *h e latter goes to zero since the arc length of the smooth curve C is a continuous 
function of t. The limit of the sequence of complex numbers S 2 , S 3 , * * ■ thus obtained is 
called the line integral (or simply the integral) of f(z ) over the path of integration C with 
the orientation given by (1). This line integral is denoted by 

(3) J f(z) dz, or by f f(z) dz 

c c 


if C is a closed path (one whose terminal point Z coincides with its initial point z 0 , as for 
a circle or for a curve shaped like an 8). 

General Assumption. All paths of integration for complex line integrals are assumed to 
be piecewise smooth, that is, they consist of finitely many smooth cu/yes joined end to end. 


Basic Properties Directly Implied by the Definition 

1. Linearity. Integration is a linear operation, that is, we can integrate sums term by 
term and can take out constant factors from under the integral sign. This means that 
if the integrals of f 1 and f 2 over a path C exist, so does the integral of k^fi + k 2 f 2 
over the same path and 


(4) 


/ [*l/l(2) + Wz)] dz = k 1 J h(z) dz + k 2 f f 2 (z) dz. 


2. Sense reversal in integrating over the same path, front z 0 to Z (left) and front Z to 
j 0 (right), introduces a minus sign as shown. 


(5) 


f f(z) dz = -f °f(z) dz. 

J Zo J Z 


3. Partitioning of path (see Fig. 338) 


( 6 ) 


J f(z) dz = / f(z)dz + J f(z)dz. 

C Cj O2 


C, 



Fig. 338. Partitioning of path [formula (6)] 


Existence of the Complex Line Integral 

Our assumptions that f(z ) is continuous and C is piecewise smooth imply the existence 
of the line integral (3). This can be seen as follows. 

As in the preceding chapter let us write f(z) = u(x, y) -f iv(x. y). We also set 


Cm Cm 4 " it}ni 


and 


4 " i Ay m , 
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Then (2) may be written 

(7) s n = 2 (« + io)(Ax m + i Ay m ) 


where u = w(f m , i? m ), v = u(f m , ^ w ) and we sum 0VQr m from 1 to n. Performing the 
multiplication, we may now split up S n into four sums: 

s„ = X « A.v m - 2 V A y m + / [E » Ay m + Aa to ] . 

These sums are real. Since / is continuous, « and u are continuous. Hence, if we let n 
approach infinity in the aforementioned way, then the greatest Ax m and A y m will approach 
zero and each sum on the right becomes a real line integral: 

(8) Lim S n = \ f(z ) dz— \ u dx — f v dy + i f \ u dy + [ v dx 

n—»oc J c Jq J c | •'C 

This shows that under our assumptions on f and C the line integral (3) exists and its value 
is independent of the choice of subdivisions and intermediate points £ m . ■ 

First Evaluation Method: 

Indefinite Integration and Substitution of Limits 

This method is the analog of the evaluation of definite integrals in calculus by the 
well-known formula 

r» 

J f(x) dx = F(b) - F(a) IF'ix) = fix)]. 

CL 



It is simpler than the next method, but it is suitable for analytic functions only. To formulate 
it, we need the following concept of general interest. 

A domain D is called simply connected if every simple closed curve (closed curve 
without self-intersections) encloses only points of D. 

For instance, a circular disk is simply connected, whereas an annulus (Sec. 13.3) is not 
simply connected. (Explain!) 


THEOREM 1 


Indefinite Integration of Analytic Functions 

Let f(z) be analytic in a simply connected domain D. Then there exists an 
indefinite integral of f{z) in the domain D, that is, an analytic function F(z) such that 
F'(z) = f(z) in D, and for all paths in D joining two points Zq and Z\ in D we have 

(9) I fiz) dz = Fi Zl ) ~ F{z 0 ) [F’iz) = /( 2 )]. 

20 

( Note that we can write z 0 cind z± instead of C, since we get the same value for all 
those C from Zq to z 2 .) 




SEC 14.1 Line Integral in the Complex Plane 


641 


This theorem will be proved in the next section. 

Simple connectedness is quite essential in Theorem 1, as we shall see in Example 5. 
Since analytic functions are our main concern, and since differentiation formulas will often 
help in finding F(z) for a given f(z) = F'(z), the present method is of great practical interest. 

If f(z ) is entire (Sec. 13.5), we can take for D the complex plane (which is certainly 
simply connected). 


EXAMPLE 1 f z 2 dz = \ z 3 
J 0 3 

r- 


EXAMPLE 2 


EXAMPLE 3 


EXAMPLE 4 


cos zdz — sin z 


s. 


— m 

,8— 3?ri 


1.3 22 
= T (' + ') = “ J + I ' 

i 

= 2 sin m = 2/ sinh tt — 23.097/ 


e zf2 dz = 2e zl2 


8 


8— 3t ri 


= 2(* 4 ~ 37r?V2 - e* +7ril2 ) = 0 


8+wi 


since e * is periodic with period 2m. H 

J dz nr ( m nr\ 

— = Ln / - Ln (-/) = — y — — j = iir. Here D is the complex plane without 0 and the negative 

real axis (where Ln z is not analytic). Obviously, D is a simply connected domain. I 

Second Evaluation Method: 

Use of a Representation of a Path 

This method is not restricted to analytic functions but applies to any continuous complex 
function. 


THEOREM 2 


Integration by the Use of the Path 

Let C be a piecewise smooth path, represented by z = z(t\ where a ^ t ^ b. Let 
f(z) be a continuous function on C. Then 


( 10 ) 


J f(z)dz = J f[z(t)]z(t) dt 


M)- 


PROOF The left side of (10) is given by (8) in terms of real line integrals, and we show that the 
right side of (10) also equals (8). We have z = x 4- iy, hence z = x + iy. We simply 
write u for w[x(f), y(t)] and v for v[x(t), y(/)]. We also have dx = x dt and dy = y dt . 
Consequently, in (10) 


J /U(0H(0 dt = J (u + iv)(x + iy) dt 

CL CL 

= [udx — v dy + i(u dy + v dx)] 

J c 

= J (udx — v dy) -b if (u dy + v dx). 
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EXAMPLE 5 


EXAMPLE 6 


COMMENT. In (7) and (8) of the existence proof of the complex line integral we referred 
to real line integrals. If one wants to avoid this, one can take (10) as a definition of the 
complex line integral. 

Steps in Applying Theorem 2 

(A) Represent the path C in the form z(f) b). 

(B) Calculate the derivative z(t) = dzldt. 

(C) Substitute z(t) for every z in f{z) (hence x(t) for x and y(t) for y). 

(D) Integrate /[z(/)]z(r) over t from a to b. 

A Basic Result: Integral of 1/z Around the Unit Circle 

We show that by integrating 1/z counterclockwise around the unit circle (the circle of radius I and center 0; see 
Sec. 13.3) we obtain 


(ii) 


r dz 

tT" 


2 m 


(C the unit circle, 
counterclockwise). 


This is a very important result that we shall need quite often. 

Solution . (A) We may represent the unit circle C in Fig. 327 of Sec. 13.3 by 

z{t) = cos t + / sin t = e n (0 ^ t ^ 27t). 


so that counterclockwise integration corresponds to an increase of t from 0 to 27 r. 

(B) Differentiation gives z0) = ie u (chain rule!). 

(C) By substitution. f(z(t)) = 1 lz(t) = e~ n . 

(D) From (10) we thus obtain the result 



<lt 


='j d 


(It = 2777. 


Check tliis result by using z{t) = cos t + i sin t. 

Simple connectedness is essential in Theorem L Equation (9) in Theorem 1 gives 0 for any closed path 
because then z\ = Zq* so that F{t\) — F(zq) = 0. Now 1/z is not analytic at z = 0. But any simply connected 
domain containing the unit circle must contain z ** 0, so that Theorem 1 does not apply — it is not enough that 
l/z is analytic in an annulus, say, | < \z\ < §, because an annulus is not simply connected! M 


Integral of l/z m with Integer Power m 

Let f(z) = (z — z 0 ) m where m is the integer and zq a constant. Integrate counterclockwise around the circle C 
of radius p with center at Zq (Fig. 339). 



Fig. 339. Path in Example 6 
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EXAMPLE 7 


Solution . Wc may represent C in the form 


Then we have 


and obtain 


z(i) = zo + p(cos t 4* i sin r) = zo + p<? 
(z - , o) ”» = p m e im \ dz = ipe u dt 


<f(z- z 0 ) m dz= f 

J r> J 


,2tt 

-it _ • m+1 
p e /pc t/f — /p 

0 


■C 


By the Euler formula (5) in Sec. 13.6 the right side equals 

,2ir 2?r 


ip m+1 f 
_ J 0 


cos (m + 1 )tdt + i I sin (/n + 1)7 tf/ 
J o 


(0 ^ 7 ^ 2tt). 


If 77i = — I, we have p m+1 = 1, cos 0 = 1, sin 0 = 0. We thus obtain 2 tt/. For integer m 7= 1 each of the two 
integrals is zero because we integrate over an interval of length 2 t t, equal to a period of sine and cosine. Hence 
the result is 


( 12 ) 


f (2777 

f (z-z 0 ) m dz= A 

J c 1.0 


(m = - 1 ), 

(tti =£ —1 and integer). 


Dependence on path. Now comes a very important fact. If we integrate a given function 
f(z ) from a point z 0 to a point Zi along different paths, the integrals will in general have 
different values. In other words, a complex line integral depends not only on the endpoints 
of the path but in general also on the path itself The next example gives a first impression 
of this, and a systematic discussion follows in the next section. 

Integral of a Nonanalytic Function. Dependence on Path 

Integrate f(z) = Re z = x from 0 to i + 2/ (a) along C* in Fig. 340, (b) along C consisting of C x and C 2 . 


2 |- o 2 = 1 + 2/ 


cy 


/ n 

/ c i 


l .v 

Fig. 340. Paths in Example 7 


Solution . (a) C* can be represented by z{t) - t + 2/7 (0 % 7 ^ I). Hence z(t) =1+2/ and f[z(0] — -*(r) = 7 
on C*. We now calculate 


(b) We now have 


Re z dz 
c* 



7(1 + 2i)dr = ^(1 +2/) = \ 


+ /. 


Cl- Z(t ) = /, Z(t) = 1, /(£(/)) = x(t) = / (0 ^ ^ I) 

C 2 : z(0 = 1 + if, z(/) = 1, /(z(/)) = A‘(/) = I (0^7^ 2). 
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Using ( 6 ) we calculate 

I Re z dz = I Re z dz + I Re c dz = I t dt + I 1 • / dt = 3 + 2i. 
Jr* Jr* Jr* J n J 


Note that this result differs from the result in (a). 


Bounds for Integrals. ML-lnequality 

There will be a frequent need for estimating the absolute value of complex line integrals. 
The basic formula is 


(13) 




f(z)dz 


^ ML 


(M L-inequality); 


L is the length of C and M a constant such that |/(z)| ^ M everywhere on C. 


PROOF Taking the absolute value in (2) and applying the generalized inequality (6*) in Sec. 13.2, 
we obtain 


\Sn\ = 


2 fUm) AZro 

m= 1 


n 


n 


^2 |Az„J. 


Now lA^I is the length of the chord whose endpoints are Zm-i and z m (see Fig. 337 on 
p. 638). Hence the sum on the right represents the length L* of the broken line of chords 
whose endpoints are Zo> * * * > Zn (= Z). If n approaches infinity in such a way that the 
greatest |Af w | and thus |A zj\ approach zero, then L* approaches the length L of the curve 
C, by the definition of the length of a curve. From this the inequality (13) follows. ■ 


We cannot see from (13) how close to the bound ML the actual absolute value of the 
integral is, but this will be no handicap in applying (13). For the time being we explain 
the practical use of (13) by a simple example. 


EXAMPLE 8 Estimation of an Integral 



Fig. 341. Path In 
Example 8 


Find an upper bound for the absolute value of the integral 

I z 2 dz, C the straight-line segment from 0 to 1 + u Fig. 341. 

J c 

Solution . L = V 2 and |/(z)| = \z 2 \ ^ 2 on C gives by (13) 


IW s 


2V2 = 2.8284. 


2 2 

The absolute value of the integral is — — + — i 


= — V2 = 0.9428 (see Example I). 


Summary on Integration. Line integrals of f(z) can always be evaluated by (10), using 
a representation (1) of the path of integration. If f(z) is analytic, indefinite integration by 
(9) as in calculus will be simpler. 
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PROBLEM SET 14.1 


1-9 


PARAMETRIC REPRESENTATIONS 


Find and sketch the path and its orientation given by: 


1. z(t) = (1 + 3/)/(l £ / £ 4) 


2. z(t) = 5 - 2/7 (- 3 ^ ^ 3) 

3. z(t) = 44-/ + 3<? iJ (0 ^ / £ 2 tt) 

4. 2« = 1 + / + e“ vit (0^/32) 

5. 2(/) = * i£ (0 ^ ^ tt) 

6. zit) = 3 + 4 / + 5* ir (t r ^ ^ 2 tt) 


7. c(/) = 6 cos 2/ + 5/ sin 2/ (0 ^ t ^ 7r) 

8. s(/) = I + 2r + 8/7 2 (-15/^ 1) 

9. z(0 = / + i/7 3 (-1 ^ ^ 2) 


10-18 


PARAMETRIC REPRESENTATIONS 


Sketch and represent parametrically: 

10. Segment from 1 + / to 4 - 2/ 

11. Unit circle (clockwise) 

12. Segment from a + ib to c + id 

13. Hyperbola at = 1 from I + / to 4 + £/' 

14. Semi-ellipse a 2 /<i 2 + y 2 fb 2 = 1 , — 0 

15. Parabola y = 4 - 4a* 2 (-1 ^ x ^ I ) 

16. \z - 2 + 3/| = 4 (counterclockwise) 

17. \z + a + ib\ = r (clockwise) 

18. Ellipse 4(x - l) 2 + 9(y + 2) 2 = 36 


1 1 9-29 1 INTEGRATION 

Integrate by the First method or state why it does not apply 
and then use the second method. (Show the details of your 
work.) 


19. I Re 2 dz. C the shortest path from 0 to I + / 

J c 

20. I Re 2 dz, C the parabola y = a 2 from 0 to 1 + / 

J c 

21. I e 22 dz , C the shortest path from m to 2iri 
J c 

22. I sin z dz, C any path from 0 to 2/ 

J c 

23. I cos 2 2 dz from -iri along \z\ = v to m in the right 

J r> 


J C 

half-plane 


24. I (2 + 2 ! ) dz, C the unit circle (counterclockwise) 
J c 

25. I cosh 4zdz* C any path from — 7T//8 to tt/78 


26. I z dz, C from — 1 + / along the parabola y = .v 2 to 
J c 

1 + / 

27. I sec 2 2 dz, C any path from 7 t/ 4 to 7T//4 
J c 

28. j Im 2 2 dz counterclockwise around the triangle with 
J c 

vertices 2 = 0. I , / 

29. I ze^ n dz . C from / along the axes to 1 
J c 

30. (Sense reversal) Verify (5) for /(2) = 2 2 . where C is 
the segment from — 1 — / to 1 + /. 

31. (Path partitioning) Verify (6) for f(z) = 1/2 and C x 
and C 2 the upper and lower halfs of the unit circle 

32. (AfL-inequality) Find an upper bound of the absolute 
value of the integral in Prob. 19. 

33. (Linearity) Illustrate (4) with an example of your own. 
Prove (4). 

34. TEAM PROJECT. Integration, (a) Comparison. 
Write a short report comparing the essential points of 
the two integration methods. 

(b) Comparison. Evaluate I f{z)dz by Theorem 1 

J c 

and check the result by Theorem 2, where: 

(i) f(z) — z 4 and C is the semicircle | 2 | = 2 from 
—2/ to 2/ in the right half-plane. 

(ii) /(2) = e 2 * and C is the shortest path from 0 
to 1 + 2/. 

(c) Continuous deformation of path. Experiment 
with a family of paths with common endpoints, say, 
z(t) = / + iVi sin /, 0 ^ t ^ 7T. with real parameter a. 
Integrate nonanalytic functions (Re 2 , Rc (2 2 ), etc.) and 
explore how the result depends on a. Then take analytic 
functions of your choice. (Show the details of your 
work.) Compare and comment. 

(d) Continuous deformation of path. Choose 
another family, for example, semi-ellipses 
z(t) = a cost + /sinr, —tt12 ^ / %. tt/ 2, and 
experiment as in (c). 

35. CAS PROJECT. Integration. Write programs for the 
two integration methods. Apply them to problems of 
your choice. Could you make them into a joint program 
that also decides which of the two methods to use in a 
given case? 
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14 .j Cauchy's Integral Theorem 

We have just seen in Sec. 14.1 that a line integral of a function f(z) generally depends 
not merely on the endpoints of the path, but also on the choice of the path itself. This 
dependence often complicates situations. Hence conditions under which this does not 
occur are of considerable importance. Namely, if f(z) is analytic in a domain D and D is 
simply connected (see Sec. 14.1 and also below), then the integral will not depend on the 
choice of a path between given points. This result (Theorem 2) follows from Cauchy’s 
integral theorem, along with other basic consequences that make Cauchy's integral 
theorem the most important theorem in this chapter and fundamental throughout complex 
analysis. 

Let us begin by repeating and illustrating the definition of simple connectedness 
(Sec. 14.1) and adding some more details. 

1. A simple closed path is a closed path (Sec. 14.1) that does not intersect or touch 
itself (Fig. 342). For example, a circle is simple, but a curve shaped like an 8 is not 
simple. 



Simple 




Simple Not simple 

Fig. 342. Closed paths 



2. A simply connected domain D in the complex plane is a domain (Sec. 13.3) such 
that every simple closed path in D encloses only points of D. Examples: The interior 
of a circle (“open disk”), ellipse, or any simple closed curve. A domain that is not 
simply connected is called multiply connected. Examples: An annulus (Sec. 13.3), 
a disk without the center, for example, 0 < |z| < 1. See also Fig. 343. 



Simply 

connected 


Simply 

connected 


Doubly 

connected 


Triply 

connected 


Fig. 343. Simply and multiply connected domains 


More precisely, a bounded domain D (that is. a domain that lies entirely in some circle about the origin) is 
called />-fo!d connected if its boundary consists of/; closed connected sets without common points. These sets 
can be curves, segments* or single points (such as z = 0 for 0 < \z\ < 1, for which p = 2). Thus. D has p - I 
“holes”, where “hole” may also mean a segment or even a single point. Hence an annulus is doubly connected 
(P = 2). 
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THEOREM 1 


EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


EXAMPLE 4 


Cauchy’s Integral Theorem 


Iff(z) is analytic in a simply connected domain D, then for every simple closed path 

C in D , 


(i) 

<f> f(z) dz = 0. See Fig. 344. 

J c 



Fig. 344. Cauchy’s integral theorem 


Before we prove the theorem, let us consider some examples in order to really understand 
what is going on. A simple closed path is sometimes called a contour and an integral over 
such a path a contour integral. Thus, (1) and our examples involve contour integrals. 

No Singularities (Entire Functions) 

e 2 dz = 0, cos z dz = 0, z n dz = 0 (#i = 0, 1, • • •) 

for any closed path, since these functions are entire (analytic for all z). I 


Singularities Outside the Contour 



dz 


0 . 



= 0 


where C is the unit circle, sec z = 1/cos z is not analytic at z — ±7t/2, ±3tt/2 , • • • , but all these points lie 
outside C; none lies on C or inside C. Similarly for the second integral, whose integrand is not analytic at 
z = ±2/ outside C. ■ 


Nonanalytic Function 



dt = 2 m 


where C: z(t) = c lt is the unit circle. This does not contradict Cauchy’s theorem because f(z ) = z is not 
analytic. ■ 


Analyticity Sufficient, Not Necessary 



where C is the unit circle. This result does not follow from Cauchy’s theorem, because f(z) = l/z 2 is not analytic 
at z = 0. Hence the condition that f be analytic in D is sufficient rather than necessary for (1) to be true. M 
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EXAMPLE 5 


PROOF 


Simple Connectedness Essential 



for counterclockwise integration around the unit circle (see Sec. 14.1). C lies in the annulus ^ < |z| < § where 
Uz is analytic, but this domain is not simply connected, so that Cauchy’s theorem cannot be applied. Hence the 
condition that the domain D be simply connected is essential 
In other words, by Cauchy’s theorem, if /(-) is analytic on a simple closed path C and everywhere inside C, 
with no exception, not even a single point, then (1) holds. The point that causes trouble here is z = 0 where Mz 
is not analytic. ■ 

Cauchy proved his integral theorem under the additional assumption that the derivative 
f\z) is continuous (which is true, but would need an extra proof). His proof proceeds as 
follows. From (8) in Sec. 14.1 we have 

<p f(z) dz — y (it clx — v dy) H- / <p (a dy 4- v dx). 

J c J c J c 

Since f(z) is analytic in £>, its derivative f'(z) exists in D. Since f'(z) is assumed to be 
continuous, (4) and (5) in Sec. 13.4 imply that u and u have continuous partial derivatives 
in D. Hence Green’s theorem (Sec. 10.4) (with it and — v instead of F 1 and F z ) is applicable 
and gives 



where R is the region bounded by C. The second Cauchy-Riemann equation (Sec. 13.4) 
shows that the integrand on the right is identically zero. Hence the integral on the left is 
zero. In the same fashion it follows by the use of the first Cauchy-Riemann equation that 
the last integral in the above formula is zero. This completes Cauchy’s proof. ■ 

Goursat’s proof without the condition that f'(z) is continuous 1 is much more 
complicated. We leave it optional and include it in App. 4. 

Independence of Path 

We know from the preceding section that the value of a line integral of a given function 
f(z) from a point zi to a point z * 2 will in general depend on the path C over which we 
integrate, not merely on Z\ and z 2 - It is important to characterize situations in which this 
difficulty of path dependence does not occur. This task suggests the following concept. 
We call an integral of f(z) independent of path in a domain D if for every z%, Z 2 in D 
its value depends (besides on f(z), of course) only on the initial point Z\ and the terminal 
point Z 2 > hut not on the choice of the path C in D [so that every path in D from Zi to z z 
gives the same value of the integral of f(z)]. 


l EDOUARD GOURSAT (1858-1936). French mathematician. Cauchy published the theorem in 1825. The 
removal of that condition by Goursat (see Transactions Amer. Math . Soc„ voi. 1. 1900) is quite important, for 
instance, in connection with the fact that derivatives of analytic functions are also analytic, as we shall prove 
soon. Goursat also made important contributions to PDEs. 
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THEOREM 2 


PROOF 


Independence of Path 

If f(z) is analytic in a simply connected domain D, then the integral of f(z) is 
independent of path in D . 


Let Z\ and z 2 be any points in D. Consider two paths C 1 and C 2 in D from z x to z 2 without 
further common points, as in Fig. 345. Denote by C 2 the path C 2 with the orientation 
reserved (Fig. 346). Integrate from Z\ over C x to z% and over C 2 back to z x . This is a 
simple closed path, and Cauchy’s theorem applies under our assumptions of the present 
theorem and gives zero: 

(2') [ fdz + [ fdz = 0, thus [/<&=-[ fdz. 

J Ci J c 1 J ct 

But the minus sign on the right disappears if we integrate in the reverse direction, from 
Z\ to z 2 , which shows that the integrals of f(z) over C x and C 2 are equal. 


( 2 ) 


J f{z ) dz = f f(z ) dz 
c, J c 2 


(Fig. 345). 


This proves the theorem for paths that have only the endpoints in common. For paths that 
have finitely many further common points, apply the present argument to each “loop” 
(portions of C x and C 2 between consecutive common points; four loops in Fig. 347). For 
paths with infinitely many common points we would need additional argumentation not 
to be presented here. ■ 



Fig. 345. Formula (2) Fig. 346. Formula (2 ; ) 


Fig. 347. Paths with more 
common points 


Principle of Deformation of Path 

This idea is related to path independence. We may imagine that the path C 2 in (2) was 
obtained from Q by continuously moving C x (with ends fixed!) until it coincides with 
C 2 . Figure 348 shows two of the infinitely many intermediate paths for which the integral 
always retains its value (because of Theorem 2). Hence we may impose a continuous 
deformation of the path of an integral, keeping the ends fixed. As long as our deforming 
path always contains only points at which f(z) is analytic, the integral retains the same 
value. This is called the principle of deformation of path. 
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EXAMPLE 6 


THEOREM 3 


PROOF 



Fig. 348. Continuous deformation of path 


A Basic Result: Integral of Integer Powers 

From Example 6 in Sec. 14.1 and the principle of deformation of path it follows that 


(3) 



2m 

0 


On = -1) 

On # — 1 and integer) 


tor counterclockwise integration around any simple closed path containing Zq in its interior . 

Indeed, the circle \z — Zo\ = p in Example 6 of Sec. 14. 1 can be continuously deformed in two steps into a path 
as just indicated, namely, by first deforming, say, one semicircle and then the other one. (Make a sketch), ■ 


Existence of Indefinite Integral 

We shall now justify our indefinite integration method in the preceding section [formula 
(9) in Sec. 14.1]. The proof will need Cauchy’s integral theorem. 


Existence of Indefinite Integral 

If f(z) is analytic in a simply connected domain D, then there exists an indefinite 
integral F(z) of f(z) in D — thus, F\z) = f(z) — which is analytic in D, and for all 
paths in D joining any two points z 0 and Z\ in D, the integral of f(z) from Zo to Z\ 
can be evaluated by formula (9) in Sec. 14.1. 


The conditions of Cauchy’s integral theorem are satisfied. Hence the line integral of f(z) 
from any Zq in D to any z in D is independent of path in D. We keep z 0 fixed. Then this 
integral becomes a function of z, call if F(z ), 


(4) 


F(z) 


-j > 

Zq 


*) dz* 


which is uniquely determined. We show that this F(z) is analytic in D and F'(z) = /(z). 
The idea of doing this is as follows. Using (4) we form the difference quotient 


(5) 


F(z + A z) ~ F(z) 
Az 


_ 1 _ 

Az 


X 


+ A 2 


/(z*) dz* 


f f(z*)dz* 
J z<> 


Az 


r 


+&Z 

f(z*) dz*. 


We now subtract f(z ) from (5) and show that the resulting expression approaches zero as 
Az — » 0. The details are as follows. 
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We keep z fixed. Then we choose z + Az in D so that the whole segment with 
endpoints z and z + Az is in D (Fig. 349). This can be done because D is a domain, 
hence it contains a neighborhood of z. We use this segment as the path of integration 
in (5). Now we subtract /(z). This is a constant because z is kept fixed. Hence we can 
write 

-z+Az -zH-Az j -z+Az 

I f(z) dz * = /(z) I dz* = /(z) A z. Thus /(z) = — I /(z) dz*. 

J Z J Z a Z J z 


By this trick and from (5) we get a single integral: 


F(z + A z) ~ F(z) 
Az 


-m 


= — f **[ 

Az K 1 


[Kz*) ~ Hz)] dz*. 


Since /(z) is analytic, it is continuous. An e > 0 being given, we can thus find a S > 0 
such that |/(z*) — /(z)| < e when |z* — z| < 5. Hence, letting |Az| < 5, we see that the 
ML-inequality (Sec. 14.1) yields 


F(z + Az) - F(z) 

1 

*z+Az 

Az 

|Az| 

J [/(?*) “ f(z)] dz* 

J Z 




e|Az| = e. 


By the definition of limit and derivative, this proves that 


F\z) = lim 

Az— >0 


F{z + Az) - F(z) 
A z 


= f(z). 


Since z is any point in D, this implies that F(z) is analytic in D and is an indefinite integral 
or antiderivative of f(z) in D, written 


F(z) = Jf(z)dz. 


Also, if G\z) = /(z), then F f (z) — G f (z) = 0 in D\ hence F(z) — G(z) is constant in D 
(see Team Project 26 in Problem Set 13.4). That is, two indefinite integrals of f(z) can 
differ only by a constant. The latter drops out in (9) of Sec. 14.1, so that we can use any 
indefinite integral of /(z). This proves Theorem 3. ■ 



Fig. 349. Path of integration 
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Cauchy’s Integral Theorem for 
Multiply Connected Domains 

Cauchy’s theorem applies to multiply connected domains. We first explain this for a 
doubly connected domain D with outer boundary curve C x and inner C 2 (Fig. 350). If 
a function f(z) is analytic in any domain D* that contains D and its boundary curves, we 
claim that 

(6) <fi f(z ) dz = <£ f(z) dz (Fig. 350) 

J c x J c 2 

both integrals being taken counterclockwise (or both clockwise, and regardless of whether 
or not the full interior of C 2 belongs to D*). 



PROOF By two cuts C 1 and C 2 (Fig. 351) we cut D into two simply connected domains D x and 
D 2 in which and on whose boundaries f(z) is analytic. By Cauchy’s integral theorem the 
integral over the entire boundary of D x (taken in the sense of the arrows in Fig. 351) is 
zero, and so is the integral over the boundary of Z) 2 , and thus their sum. In this sum the 
integrals over the cuts C x and C 2 cancel because we integrate over them in both 
directions — this is the key — and we are left with the integrals over C x (counterclockwise) 
and C 2 (clockwise; see Fig. 351); hence by reversing the integration over C 2 (to 
counterclockwise) we have 

<£ f dz — <£ f dz — 0 
J c x J c 2 

and (6) follows. ■ 


For domains of higher connectivity the idea remains the same. Thus, for a triply connected 
domain we use three cuts C x , C 2 , C 3 (Fig. 352). Adding integrals as before, the integrals 
over the cuts cancel and the sum of the integrals over C x (counterclockwise) and C 2 , C 3 
(clockwise) is zero. Hence the integral over C x equals the sum of the integrals over C 2 
and C 3 , all three now taken counterclockwise. Similarly for quadruply connected domains, 
and so on. 



Fig. 351. Doubly connected domain Fig. 352. Triply connected domain 
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1 1-11 1 CAUCHY’S INTEGRAL THEOREM 
APPLICABLE? 


Integrate /(z) counterclockwise around the unit circle, 
indicating whether Cauchy’s integral theorem applies. 
(Show the details of your work.) 


1. f(z) = Rez 
3. /(-) = e* n 
5. f(z) = tan z z 
7. /(;) = l/fe 8 - 1-2) 
9. f(z) = I/(2|c| 3 ) 

11. f(z) = z z cot z 


2. f(z) = 1/(3 £ - 77/ ) 
4. f(z) = Uz 
6. f(z) = sec a/2) 

8. .f(z) = 1/(4; - 3) 
10. /(z) = f* 


12-17 


COMMENTS ON TEXT AND EXAMPLES 


12. (Singularities) Can we conclude in Example 2 that 
the integral of \/(z 2 + 4) taken over (a) \z — 2| = 2, 
(b) |z — 2| = 3 is zero? Give reasons. 


13. (Cauchy’s integral theorem) Verify Theorem 1 for 
the integral of z 2 over the boundary of the square 
with vertices 1 4- /, — 1 + /, - 1 - i\ and 1 — i 
(counterclockwise). 


14. (Cauchy’s integral theorem) For what contours C will 
it follow from Theorem 1 that 


1 dz f cos z 

(a) f — = 0. (b) <P -g 2 dz = 0. 

J c Z J c z ~~ z 

X e l,z 

10 iFT ?*- 07 

15. (Deformation principle) Can we conclude from 
Example 4 that the integral is also zero over the contour 
in Problem 13? 

16. (Deformation principle) If the integral of a function 
f(z) over die unit circle equals 3 and over the circle 
|z| = 2 equals 5, can we conclude that f(z ) is analytic 
everywhere in the annulus 1 < \z\ < 2? 

17. (Path independence) Verify Theorem 2 for the 
Integral of cos z from 0 to ( I + i)ir (a) over the shortest 
path, (b) over the jc-axis to tt and then straight up to 

(1 i)ir. 


18. TEAM PROJECT. Cauchy’s Integral Theorem. 

(a) Main Aspects. Each of the problems in Examples 
1-5 explains a basic fact in connection with Cauchy’s 
theorem. Find five examples of your own, more 
complicated ones if possible, each illustrating one of 
those facts. 

(b) Partial fractions. Write f(z) in terms of partial 
fractions and integrate it counterclockwise over the unit 
circle, where 


2z 4- 3/ z 4* 1 

(i) = 7TTT (»> = • 

(c) Deformation of path. Review (c) and (d) of Team 
Project 34, Sec. 14.1, in the light of the principle of 
deformation of path. Then consider another family of 
paths with common endpoints, say, z(/) = / 4- ia(t — / 2 ), 
1, and experiment with the integration of analytic 
and nonanaly tic functions of your choice over these paths 
(e.g., z, im z, z 2 , Re z 2 , Im z 2 , etc). 


19-30 1 FURTHER CONTOUR INTEGRALS 

Evaluate (showing the details and using partial fractions if 
necessary) 

19. — — — : , C the circle |z| = 3 (counterclockwise) 

20. <+> tanh z dz. C the circle \z ~ \iri\ = \ (clockwise) 

J c 

21. ^ Re 2 zdz, C as shown 


y 


r 


-i 

i * 


f lz ~ 6 

22. <P -g — — dz , C as shown 
z ~ 2z 



f dz 

23. <P ~ 2 ; C as shown 

Jr Z ~~ 1 



X 

24. dz , C consists of |z| = 2 (clockwise) and |z| = \ 


(counterclockwise) 
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r cos z 

25. <p diy C consists of \z\ = 1 (counterclockwise) 

J c z 

and |c| = 3 (clockwise) 

26. 4> Ln (2 + z)dz> C the boundary of the square with 
J c 

vertices ±1, ±i 

27. f ~ 2 ~ 7 TT • C: (a) |z| = (b) |z - f| = | 

J c Z 1 

(counterclockwise) 


28. , C: (a) |z + i\ = 1, (b) |z - i\ = 1 
Jc Z + 1 

(counterclockwise) 

29. <fi S ' n ^, rfz, C: |z - 4 — 2/| = 5.5 (clockwise) 


30. 


f tan (c/2) 

^cz 4 ~ 


tfz, C the boundary of the square with 


'c z 16 
vertices ±1, ±i (clockwise) 


14 .: Cauchy’s Integral Formula 

The most important consequence of Cauchy’s integral theorem is Cauchy’s integral 
formula. This formula is useful for evaluating integrals, as we show below. Even more 
important is its key role in proving the surprising fact that analytic functions have 
derivatives of all orders (Sec. 14.4), in establishing Taylor series representations 
(Sec. 15.4), and so on. Cauchy’s integral formula and its conditions of validity may be 
stated as follows. 


THEOREM 1 


Cauchy’s Integral Formula 

Let f(z) be analytic in a simply connected domain D. Then for any point Zo in D 
and any simple closed path C in D that encloses z 0 (Fig. 353), 

(1) <£ dz = Imfizo) (Cauchy’s integral formula) 

J cZ- z 0 

the integration being taken counterclockwise. Alternatively (for representing /(z 0 ) 
by a contour integral, divide (1) by 2m ), 

(1*) f(zo) = <fi ^ dz (Cauchy’s integral formula). 

2777 •'c Z Zq 


PROOF By addition and subtraction, f(z ) = f(zo) + [/(z) - /(zo)] - Inserting this into (1) on the 
left and taking the constant factor f(zo ) out from under the integral sign, we have 


( 2 ) 


f 


m 

7 7 ,. 

- ^*0 


dz - f(z 0 ) <f 
c 



+ 4 /w ~ m dz. 


The first term on the right equals f(z 0 ) • 2m (see Example 6 in Sec. 14.2 with m = — 1). 
This proves the theorem, provided the second integral on the right is zero. This is what 
we are now going to show. Its integrand is analytic, except at z 0 - Hence by (6) in 
Sec. 14.2 we can replace C by a small circle K of radius p and center Zo (Fig. 354), without 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 



D 




/ 

J, 

cX 


° *o / 

V^P / / 


V 

K 


Fig. 353. Cauchy's integral formula 


Fig. 354. Proof of Cauchy's integral formula 


altering the value of the integral. Since f(z ) is analytic, it is continuous (Team Project 26, 
Sec. 13.3). Hence an € > 0 being given, we can find a S > 0 such that \f(z) — /(zo)l < 6 
for all z in the disk | z - Zo\ < & Choosing the radius p of K smaller than S, we thus have 
the inequality 


m - f(z 0 ) 


£ 

< — 
p 


Z ~ Zq 

at each point of K. The length of K is 27rp. Hence, by the A/L-inequality in Sec. 14.1, 


J K 


f(z) - /(Zo) 
- So 


dz 


< — 27 rp = 27T6. 

p 


Since e (> 0) can be chosen arbitrarily small, it follows that the last integral in (2) must 
have the value zero, and the theorem is proved. ■ 


Cauchy’s Integral Formula 


^ 0 dz = 27 rie z 


= iTrie 2 = 46.4268/ 


for any contour enclosing zq = 2 (since ^ is entire), and zero for any contour for which zq = 2 lies outside (by 
Cauchy's integral theorem). ■ 


Cauchy’s Integral Formula 



-f - M 


(zq = |/ inside C). ■ 


Integration Around Different Contours 

Integrate 

z 2 + 1 

s(s)= 7^7 


z 2 + I 

(z + lXz - 1) 


counterclockwise around each of the four circles in Fig. 355. 
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Solution . g(z) is not analytic at -1 and 1. These are the points we have to watch for. We consider each 
circle separately. 


(a) The circle \z - l| = 1 encloses the point Zq = 
write 

z 2 + 1 

g(z) = 3 — j" = 


1 where g{z) is not analytic. Hence in ( 1 ) we have to 

z 2 + 1 1 

z + 1 z - I 


thus 


and (1) gives 


/(z) = 


z 2 +l 

2+1 


i 


z 2 + 1 
CJ 2 -1 


dz = 2m/(l) = 2m 



= 2tt/. 


(b) gives the same as (a) by the principle of deformation of path. 

(c) The function g(z) is as before, but f{z) changes because we must take z Q = - 1 (instead of 1 ). This gives 
a factor z - zo = z + lin(l). Hence we must write 


gU) = 


z 2 +l 
2- 1 


1 

2 + 1 : 


thus 


/00 = 


2 2 + 1 
2-1 


Compare this for a minute with the previous expression and then go on: 
2 


f; 2 + 1 z 2 + l 

<P -2 7 dz = 2tt/7(- 1) = 2tt/ — — 

-V z - 1 L ~ 1 J *— 1 


= —2m. 


(d) gives 0 . Why? 



Fig. 355. Example 3 


Multiply connected domains may be handled as in Sec. 14.2. For instance, if f(z) is 
analytic on C x and C 2 and in the ring-shaped domain bounded by Ci and C 2 (Fig. 356) 
and Zo is any point in that domain, then 


(3) 


1 L Hz) , . 1 L Hz) , 

f(z 0 ) = -r— : f dz + — f dz, 

2m J Ct z~ Zo 2 m z - Zq 


where the outer integral (over C{) is taken counterclockwise and the inner clockwise, as 
indicated in Fig. 356. 
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Fig. 356. Formula (3) 

Our discussion in this section has illustrated the use of Cauchy’s integral formula in 
integration. In the next section we show that the formula plays the key role in proving 
the surprising fact that an analytic function has derivatives of all orders, which are thus 
analytic functions themselves. 


~g:R:C>:BEEM^SErT— V-4 


(T7j CONTOUR INTEGRATION 

Integrate (z 2 — 4)/(z 2 + 4) counterclockwise around the 
circle: 

1. \z — i| = 2 2. |z — 1 1 = 2 

3. | z + 3*| = 2 4. \z\ = 7r/2 

5-17 1 CONTOUR INTEGRATION 

Using Cauchy’s integral formula (and showing the details), 
integrate counterclockwise (or as indicated) 


dz, C: z = I 


f z + 2 

5. f -dz, C: |z — 1 1 = 2 

J c z l 

, r e 3z . 

6. f r dz, C: z = 1 

J c 3z-i 

C sinh ttz , , 

8. f -£-r . C: k - 1| — «ra 

c z 1 

C:\z+l\ = l 

J c z 1 

1°. <f> — dz, C: | z - 2/ 1 = 4 

c z ~ 2l 

U. j> ^ dz, C:\zl 


2/1 = 4 


dz, C:\z\ 


13. <P — — — dz, C the boundary of the square with 
J c 2z + ' 

vertices ±1, ±i 

14. <£ — - dz , C consists of \z — 2i\ = 2 

c ^ i 1 

(counterclockwise) and | z — 2i\ = \ (clockwise) 

15. Ln {Z ~ 1} dz, C: |z - 4| = 2 
c z •* 

16. <P - 2 — Z dz, C consists of | z\ =3 (counterclockwise) 


and |z| = 1 (clockwise) 


r cosh 2 z 

17. r 7 ^ r - o dz, C as in Prob. 16 

J c (z - 1 - i)z 

18. Show that £ (z~ Zi)~\z — Z2) -1 dz = 0 for a simple 

c 

closed path C enclosing z x and z 2 > which are arbitrary. 

19. CAS PROJECT. Contour Integration. Experiment 
to find out to what extent your CAS can do contour 
integration (a) by using the second method in Sec. 14. 1 , 
(b) by Cauchy’s integral formula. 

20. TEAM PROJECT. Cauchy’s Integral Theorem. 
Gain additional insight into the proof of Cauchy’s 
integral theorem by producing (2) with a contour 
enclosing zo (as in Fig. 353) and taking the limit as in 
the text. Choose 


t ( ^ f z 3 — 6 ^ ^ f sin z j 

12. <£ ^—-7 dz, C the boundary of the triangle with z — 5/ ^ z — \tt 


vertices 0 and ±1 +2 i 


and (c) two other examples of your choice. 
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14.4 Derivatives of Analytic Functions 

In this section we use Cauchy’s integral formula to show the basic fact that complex 
analytic functions have derivatives of all orders . This is very surprising because it differs 
strikingly from the situation in real calculus. Indeed, if a real function is once 
differentiable, nothing follows about the existence of second or higher derivatives. Thus, 
in this respect, complex analytic functions behave much more simply than real functions 
that are once differentiable. 

The existence of those derivatives will result from a general integral formula, as follows. 


THEOREM 1 


Derivatives of an Analytic Function 

If f(z) is analytic in a domain D, then it has derivatives of all orders in D, which 
are then also analytic functions in D. The values of these derivatives at a point z 0 
in D are given by the formulas 


d') 

d") 


and in general 

( 1 ) 


f'(z 0 ) = t— : 4 , s 

2m J (z - Zo) 


m 


dz 


r'V v 21 -f , 

f (z »> “ m 1 77^? * 


2m J c (z - Zo) 


f(n> (7 )m -2LS M d7 

f {Zo) 2m J c (z - zo) n+1 


(« = I. 2, ■ ■ •); 


here C is any simple closed path in D that encloses Zq and whose full interior belongs 
to D; and we integrate counterclockwise around C (Fig. 357). 



Fig. 357. Theorem 1 and its proof 


COMMENT. For memorizing (1), it is useful to observe that these formulas are obtained 
formally by differentiating the Cauchy formula (1*), Sec. 14.3, under the integral sign 
with respect to Zq- 




SEC 14.4 Derivatives of Analytic Functions 


659 


PROOF We prove (1 '), starting from the definition of the derivative 


f\z 0 ) = lim 

A2->0 


f(zp + Az) - /(zp) 

Az 


On the right we represent /(zo + A z) and /(z 0 ) by Cauchy’s integral formula: 

+ l fr M 4 iJLJ. 

A z 2mAz [J c z - (z 0 + A z) z - z 0 J 

We now write the two integrals as a single integral. Taking the common denominator 
gives the numerator f(z){z — Zq — [z - (zo + Az)]} = /(z) Az, so that a factor Az drops 
out and we get 

/(zp + Az) ~ f(z Q ) = J_ r /(z) ^ 

Az 2m J c (z - z 0 ~ A z)(z - z 0 ) 

Clearly, we can now establish (l') by showing that, as Az— > 0, the integral on the right 
approaches the integral in (1 '). To do this, we consider the difference between these two 
integrals. We can write this difference as a single integral by taking the common 
denominator and simplifying the numerator (as just before). This gives 


4 M dz -& M dz = 4 

■'c “ *o - Az)(z - Zo) * c (z - Zo f J- 


f(z)Az 


c (z - Zo - Az)(z - zo) 2 


dz. 


We show by the ML-inequality (Sec. 14.1) that the integral on the right approaches zero 
as Az — > 0. 

Being analytic, the function /(z) is continuous on C, hence bounded in absolute value, 
say, |/(z)| = K. Let d be the smallest distance from Zq to the points of C (see Fig. 357). 
Then for all z on C, 


| z “ z 0 | 2 = d 2 , hence 


1 < J_ 

\z - Zo\ Z ~ d 2 ' 


Furthermore, by the triangle inequality for all z on C we then also have 

d ^ |z - z 0 \ = \z — Zq - Az + Az| ^ \z - z 0 - Az| + |Az|. 

We now subtract |Az| on both sides and let |Az| = d! 2, so that — |Az| ~ — d/2, Then 


\dt=k d — |Az| = \z — Zq “ Az|. Hence 


1 2 

^ — . 

| z - Zo “ Az| d ’ 


Let L be the length of C. If |Az| = d/2, then by the ML-inequality 

f(z)Az 


f £ 

(.7 — 7 /> — 


(z - z 0 - A zXz - ZoY 


dz 


sa «J7- 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


This approaches zero as A z 0. Formula (1 ') is proved. 

Note that we used Cauchy’s integral formula (1*), Sec. 14.3, but if all we had known 
about f(zo) is the fact that it can be represented by (1*), Sec. 14.3, our argument would 
have established the existence of the derivative f'(Zo) of /(z). This is essential to the 
continuation and completion of this proof, because it implies that (l") can be proved by 
a similar argument, with / replaced by and that the general formula (1) follows by 
induction. ■ 


Evaluation of Line Integrals 

From (l'), for any contour enclosing the point m (counterclockwise) 


f 


cos z 


(z - m)‘ 


72 dz = 2777 (cos 


= —2m sin m = 2tt sinh tt. 


From (1"), for any contour enclosing the point — / we obtain by counterclockwise integration 


- 

o 

- 


: -3z 2 + 6 
(z + if 


dz = iri(z 4 — 3 z 2 + 6)" - iri[l2z 2 — 6] z __,- = — I87 t/. 


By (l')» for any contour for which I lies inside and ±2/ lie outside (counterclockwise), 

-2 Vi 


c (z - 1)V + 4) 


& -“(?T7)L 

f(z 2 + 4) — e z 2z 


= 2t ri- 


te 2 + 4 ) 2 


6e7T 

~25 


i 2.050/. 


Cauchy’s Inequality. Liouville's and Morera's Theorems 

As a new aspect, let us now show that Cauchy’s integral theorem is also fundamental in 
deriving general results on analytic functions. 

Cauchy’s Inequality. Theorem 1 yields a basic inequality that has many applications. 
To get it, all we have to do is to choose for C in (1) a circle of radius r and center Zo and 
apply the ML-inequality (Sec. 14.1); with |/(z)| ^ M on C we obtain from (1) 


rwi = £ 


f. 


f(z ) 


- t-Y* +1 


c (z - z 0 y 


dz 


n! 1 

— — M — 7TT 2 t rr. 
2tt F* +1 


This gives Cauchy’s inequality 


( 2 ) 




«!M 


To gain a first impression of the importance of this inequality, let us prove a famous 
theorem on entire functions (definition in Sec. 13.5). (For Liouville, see Sec. 5.7.) 
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THEOREM 2 


Liouville’s Theorem 

If an entire function is bounded in absolute value in the whole complex plane, then 
this function must be a constant 


PROOF By assumption, |/(z)| is bounded, say, |/(z)| < K for all z. Using (2), we see that 
\f'(z 0 )\ < Kir. Since f(z ) is entire, this holds for every /*, so that we can take r as large 
as we please and conclude that f(zo) = 0. Since z 0 is arbitrary, f'(z) = u x + iv x = 0 
for all z (see (4) in Sec. 13.4), hence u x = v x = 0, and u y = v y = 0 by the Cauchy-Riemann 
equations. Thus u = const, v = const, and f = u + iv = const for all z. This completes 
the proof. ■ 


Another very interesting consequence of Theorem 1 is 


THEOREM 3 


Morera’s 2 Theorem (Converse of Cauchy’s Integral Theorem) 

If f(z) is continuous in a simply connected domain D and if 

(3) f.f(z)dz = 0 

J c 

for every closed path in D, then f(z) is analytic in D. 


PROOF In Sec. 14.2 we showed that if /(z) is analytic in a simply connected domain D, then 

F(z) = f f(z*) dz* 

J *o 

is analytic in D and F\z) = /(z). In the proof we used only the continuity of /(z) and the 
property that its integral around every closed path in D is zero; from these assumptions 
we concluded that F(z) is analytic. By Theorem 1, the derivative of F(z) is analytic, that 
is, /(z) is analytic in D, and Morera’s theorem is proved. ■ 



1-8 CONTOUR INTEGRATION 

e 2 cos z 

cosz 

Integrate counterclockwise around the circle |z| = 2. (n is 

3 ‘ (z - irilf 


a positive integer, a is arbitrary.) Show the details of your 

sinh az 

^ Ln (z + 3) + cos z 

work. 

5 - z 4 



cosh 3z sin z z n e* 

z 5 2 * (z - tt//2) 4 7m (z ~ a) n+1 8 ’ (z - a) n 


2 G1ACINT0 MORERA (1856-1909), Italian mathematician who worked in Genoa and Turin. 
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9-13 1 INTEGRATION AROUND DIFFERENT 
CONTOURS 

Integrate around C. Show the details. 


(1 + 2 z) cos z 
(2z - l) 2 


, C the unit circle, counterclockwise 


sin4z 

10. ^3 , C consists of \z\ = 5 (counterclockwise) 

and \z — 3 1 = § (clockwise) 

tan irz 

11. — ~2 — , C the ellipse I6,v^ 4- y — 1 , counterclockwise 


12. — — j , C consists of \z — i | = 3 (counterclockwise) 

z(,z — 2 1 ) 


and \z\ = 1 (clockwise) 


~t , C the circle \z — 2 — / = 3, counterclockwise 

( z - ay 


14. TEAM PROJECT. Theory on Growth 

(a) Growth of entire functions. If f(z) is not a 
constant and is analytic for all (finite) z, and R and M 
are any positive real numbers (no matter how large), 
show that there exist values of z for which |z| > R and 
|/(z)| > M. 

(b) Growth of polynomials. If /(z) is a polynomial 
of degree n > 0 and M is an arbitrary positive real 
number (no matter how large), show that there exists 
a positive real number R such that |/(z)| > M for all 
\z\ >R. 

(c) Exponential function. Show that f(z) = e z has 
the property characterized in (a) but does not have that 
characterized in (b). 

(d) Fundamental theorem of algebra. If ,f(z) is a 
polynomial in z, not a constant . , then /(z) = 0 for at 
least one value of z. Prove this, using (a). 

15. (Proof of Theorem 1) Complete the proof of Theorem 
1 by performing the induction mentioned at the end. 


ECT3EJEFEB=rS3EE5eE»EEiqFQEESTIONS AND PROBLEMS 


1. What is a path of integration? What did we assume 
about paths? 

2. State the definition of a complex line integral from 
memory. 

3. What do we mean by saying that complex integration 
is a linear operation? 

4. Make a list of integration methods discussed. Illustrate 
each with a simple example. 

5. Which integration methods apply to analytic functions 
only? 

6. What value do you get if you integrate 1/z 
counterclockwise around the unit circle? (You should 
memorize this basic result.) If you integrate 1/z 2 , 
1/z 3 , • • • ? 

7. Which theorem in this chapter do you regard as most 
important? State it from memory. 

8. What is independence of path? What is the principle of 
deformation of path? Why is this important? 

9. Do not confuse Cauchy’s integral theorem and Cauchy’s 
integral formula. State both. How are they related? 

10. How can you extend Cauchy’s integral theorem to 
doubly and triply connected domains? 

11. If integrating f(z) over the boundary circles of an 
annulus D gives different values, can /(z) be analytic 
in D1 (Give reason.) 

12. Is |j" /(z)r/z| = J \f{z)\dzl How would you find a 
bound for the integral on the left? 


13. Is Re I /(z) dz = I Re /(z) dzl Give examples. 

J c J c 

14. How did we use integral formulas for derivatives in 
integration? 

15. What is Liouville’s theorem? Give examples. State 
consequences. 

1 6-30 1 INTEGRATION 

Integrate by a suitable method: 

16. 4z 3 4 2z from — t to 2 4 i along any path 

17. 5z — 3/z counterclockwise around the unit circle 

18. z 4 1/z counterclockwise around |z 4 3/| = 2 

19. e Zz from —2 4 3m along the straight segment to 
-2 4 5m 

20 . <?* 2 /(z - 1 ) 2 counterclockwise around |z| = 2 

21. zKz 2 4 1) clockwise around |z 4 /| = 1 

22. Re z from 0 to 4 and then vertically up to 4 4 3 i 

23. cosh 4z from 0 to 2 i along the imaginary axis 

24. e z /z over C consisting of |z| = 1 (counterclockwise) and 
|z| = I (clockwise) 

25. (sin z)/z clockwise around a circle containing z = 0 in 
its interior 

26. Im z counterclockwise around |z| = r 

27. (Ln z)/(z - 2/) 2 counterclockwise around \z — 2i\ = 1 

28. (tan 7 rz)/(z — l) 2 counterclockwise around \z — l| = 0.2 

29. |z| + z clockwise around the unit circle 

30. (z — i)" 3 (z 3 4 sinz) counterclockwise around any 
circle with center / 





The complex line integral of a function f(z) taken over a path C is denoted by 
(1) J f(z ) dz or, if C is closed, also by <j> f(z) (Sec. 14.1). 


If f(z) is analytic in a simply connected domain D, then we can evaluate (1) as in 
calculus by indefinite integration and substitution of limits, that is. 


J f(z) dz = F(zi) - F(z 0 ) 


L F'(z ) = f(z)] 


for every path C in D Fi om a point zo to a point Zi (see Sec. 14.1). These assumptions 
imply independence of path, that is, (2) depends only on zo and Z\ (and on f(z), 
of course) but not on the choice of C (Sec. 14.2). The existence of an F(z) such that 
F\z ) = f(z) is proved in Sec. 14.2 by Cauchy’s integral theorem (see below). 

A general method of integration, not restricted to analytic functions, uses the 
equation z = z(t) of C, where a ^ t ^ b. 


J f(z ) dz = J f(z(t))z(t) dt (z = -^) ■ 


Cauchy’s integral theorem is the most important theorem in this chapter. It states 
that if f(z) is analytic in a simply connected domain £>, then for every closed path 
CinD (Sec. 14.2), 


ff(z)dz = 0. 


Under the same assumptions and for any z 0 in D and closed path C in D containing 
Zq in its interior we also have Cauchy’s integral formula 


, 1 1 f(z) . 

f(z o) = ~zr~. f dz. 

ATT l J — 7 1\ 


/ j v«u/ /% . i 

2m J c z- z 0 

Furthermore, under these assumptions f(z) has derivatives of all orders in D that 
are themselves analytic functions in D and (Sec. 14.4) 


/n^ = dz o. - 1. 2 . • • •). 


This implies Morera’s theorem (the converse of Cauchy’s integral theorem) and 
Cauchy's inequality (Sec. 14.4), which in turn implies Liouville's theorem that an 
entire function that is bounded in the whole complex plane must be constant. 





CHAPTER 1 5 

Power Series, Taylor Series 


Complex power series, in particular, Taylor series, are analogs of real power and Taylor 
series in calculus. However, they are much more fundamental in complex analysis than 
their real counterparts in calculus. The reason is that power series represent analytic 
functions (Sec. 15.3) and, conversely, every analytic function can be represented by power 
series, called Taylor series (Sec. 15.4). 

Use Sec. 15.1 for reference if you are familiar with convergence tests for real series — 
in complex this is quite similar. The last section (15.5) on uniform convergence is optional. 

Prerequisite: Chaps. 13, 14. 

Sections that may be omitted in a shorter course: 14.1, 14.5. 

References and Answers to Problems: App. 1 Part D, App. 2. 


15 . Sequences, Series, Convergence Tests 

In this section we define the basic concepts for complex sequences and series and discuss 
tests for convergence and divergence. This is very similar to real sequences and series in 
calculus. If you feel at home with the latter and want to take for granted that the ratio 
test also holds in complex , skip this section and go to Sec. 15.2. 

Sequences 

The basic definitions are as in calculus. An infinite sequence or, briefly, a sequence, is 
obtained by assigning to each positive integer n a number z n , called a term of the sequence, 
and is written 


Zi. z& • • • or {z lt z 2 , • • •} or briefly { z n }. 

We may also write z 0 , z*, • • • or s 2 , £ 3 , • • • or start with some other integer if convenient. 
A real sequence is one whose terms are real. 

Convergence. A convergent sequence Zi, z 2 * • * * is one that has a limit c , written 

lim z n = c or simply z n — » c. 

n —* 00 

By definition of limit this means that for every e > 0 we can find an N such that 
(U | z n — c\< e for all n > N; 


664 


SEC 15.1 Sequences, Series, Convergence Tests 


665 


geometrically, all terms z n with n > N lie in the open disk of radius € and center c 
(Fig. 358) and only finitely many terms do not lie in that disk. [For a real sequence, (1) 
gives an open interval of length 2e and real midpoint c on the real line; see Fig. 359.] 

A divergent sequence is one that does not converge. 



x c-e c c+e x 


Fig. 358. Convergent complex sequence Fig. 359. Convergent real sequence 


EXAMPLE 1 Convergent and Divergent Sequences 

The sequence {/ tt /«) = {/, —172, — //3, 1/4, • • •} is convergent with limit 0. 

The sequence {/ n j = {«, -1, -i, 1, • • •) is divergent, and so is { z n ) with z n = (1 + /) w . ■ 

EXAMPLE 2 Sequences of the Real and the Imaginary Parts 

The sequence {z n } with z n - + iy n = 1 — 1 frt 2 + /( 2 + 4 In) is 6 /, 3/4 + 4/, 8/9 4- 10/73, 15/16 + 3/, • • • . 

(Sketch it.) It converges with the limit c = 14- 2/. Observe that {A n } has the limit I = Re c and {y n } has the 
limit 2 = Im c. This is typical. It illustrates the following theorem by which the convergence of a complex 
sequence can be referred back to that of the two real sequences of the real parts and the imaginary parts. B 


THEOREM 1 Sequences of the Real and the Imaginary Parts 

A sequence * • * , z n , ■ • * of complex numbers z n = x n 4- [y n (where 

n = 1, 2, • • •) converges to c = a 4- ib if and only if the sequence of the real parts 
a' 2 , * * * converges to a and the sequence of the imaginary parts y 2 > ' * * 
converges to b. 


PROOF Convergence z n -^ c = a + ib implies convergence x n — > a and y n —>b because if 
| z n ~ 6*| < e, then z n lies within the circle of radius e about c = a 4- ib , so that 
(Fig. 360 a) 

k* - a| < e > bn - b\< e. 



(a) (b) 

Fig. 360. Proof of Theorem 1 
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Conversely, if x n a and y n — > b as n then for a given e > 0 we can choose N 
so large that, for every n > /V, 

K ~ a \ < j . b« - b\ < . 

These two inequalities imply that z n = + iy n lies in a square with center c and side 

6. Hence, z n must lie within a circle of radius e with center c (Fig. 360b). ■ 


Series 

Given a sequence z x , • * * , Zm, • * * , we may form the sequence of the sums 
= Zi, s 2 = 41 + Z 2 , s 3 = Zl + Z-2 + Zz, 

and in general 

(2) = Zi + ^2 + ' * • + (/? = 1, 2, • • •)• 

s n is called the wth partial sum of the infinite series or series 

DC 

( 3 ) li, Zm = Zi+ Z 2 + ' ' 4 ■ 

m - 1 

The zi, z 2 , * ' • are called the terms of the series. (Our usual summation letter is n, 
unless we need n for another purpose, as here, and we then use m as the summation 
letter.) 

A convergent series is one whose sequence of partial sums converges, say, 

oc 

lim s n = s. Then we write s = 2 -m = £i + -2 + ' ' * 

n-+ 00 

m— 1 

and call s the sum or value of the series. A series that is not convergent is called a divergent 
series. 

If we omit the terms of s n from (3), there remains 

(4) R„ = Z n +1 + z n + 2 + Z„+3 + • • * . 

This is called the remainder of the series (3) after the term z n . Clearly, if (3) converges 
and has the sum s, then 


s = s n + Rn, thus R n = s - s n . 

Now s n -» s by the definition of convergence; hence R n — > 0. In applications, when 5 is 
unknown and we compute an approximation s n of s, then |/? n | is the error, and R n —> 0 
means that we can make |/? n | as small as we please, by choosing n large enough. 

An application of Theorem 1 to the partial sums immediately relates the convergence 
of a complex series to that of the two series of its real parts and of its i maginar y parts: 
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THEOREM 2 


THEOREM 3 


PROOF 


THEOREM 4 


Real and Imaginary Parts 

A series (3) with z m = x m + iy m converges and has the sum s = u + iv if and only 
ifx 1 + x 2 + • * * converges and has the sum u and y x + y 2 + * • • converges and. 
has the sum v. 


Tests for Convergence and Divergence of Series 

Convergence tests in complex are practically the same as in calculus. We apply them 
before we use a series, to make sure that the series converges. 

Divergence can often be shown very simply as follows. 


Divergence 

If a series Z\ + z 2 + ' ' ‘ converges, then lim z m = 0- Hence if this does not hold, 
the series diverges. 


If zi + z 2 + * * * converges, with the sum s , then, since z, n — s m — s m - lt 

lim z m = lim (s m - s m ^ x ) = lim s m - lim s m ^ x = s - s = 0. ■ 

m-*oc iti — ►c© oo oc 

CAUTION! Zm 0 is necessary for convergence but not sufficient , as we see from the 
harmonic series l +2 + 3 + 4 + ,, ’> which satisfies this condition but diverges, as is 
shown in calculus (see, for example. Ref. [GR 11] in App. 1). 

The practical difficulty in proving convergence is that in most cases the sum of a series 
is unknown. Cauchy overcame this by showing that a series converges if and only if its 
partial sums eventually get close to each other: 


Cauchy’s Convergence Principle for Series 

A series Z\ + z 2 + * * * is convergent if and only if for every given e > 0 (no matter 
how small) we can find an N (which depends on e, in general) such that 

(5) \z n + 1 + z n + 2 + • • • + z n +p\ < € for every n > N and p = 1, 2, • • • 


The somewhat involved proof is left optional (see App. 4). 

Absolute Convergence. A series zi + z 2 + * * ' is called absolutely convergent if the 
series of the absolute values of the terms 

oo 

2 \zm\ = Izil + tal + • • • 

m=l 

is convergent. 

If Zi + z 2 + • • • converges but + \zz\ + • • • diverges, then the series Zi + Zz + • • • 
is called, more precisely, conditionally convergent. 
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EXAMPLE 3 


THEOREM 5 


PROOF 


THEOREM 6 


PROOF 


A Conditionally Convergent Series 

The series I — | + converges, but only conditionally since the harmonic series diverges, as 
mentioned above (after Theorem 3). M 

If a series is absolutely convergent , it is convergent . 

This follows readily from Cauchy’s principle (see Team Project 30). This principle also 
yields the following general convergence test. 


Comparison Test 

If a series Z\ + Z 2 + • * ’ is given and we can find a convergent series b x + b 2 + • * * 
with nonnegative real terms such that |zj ^ b l9 \z 2 \ = & 2 > * * * * ^ ien given series 
converges, even absolutely. 


By Cauchy’s principle, since b x + b 2 + • • • converges, for any given e > 0 we can find 
an N such that 

b n+ 1 + • • • + b n + p < e for every n > N and p = 1, 2, • • • . 

From this and \z x \ = b x , \z 2 \ = fc 2 , * * • we conclude that for those n and p , 

l^n+ll kn+pl — ^n+1 “1“ * * * “l” b n + p < €. 

Hence, again by Cauchy’s principle, \z x \ + \z 2 \ + • • • converges, so that z x + z 2 + * # • 
is absolutely convergent. ■ 

A good comparison series is the geometric series, which behaves as follows. 


Geometric Series 

The geometric series 

00 

(6*) 2 = 1 + <7 + q 2 + ■ ■ • 

m=0 

converges with the sum 1/(1 - 4 ) if\q\ < 1 and diverges if\q\ = 1 . 


If |#| ^ 1, then |g m | ^ 1 and Theorem 3 implies divergence. 

Now let \q\ < 1. The nth partial sum is 

s n = 1 + q + * • • + q n . 

From this, 

qs n = q + • • • + q n + q n+1 . 

On subtraction, most terms on the right cancel in pairs, and we are left with 
$n QS n (1 ~ q)Sn ~ 1 Q n 
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THEOREM 7 


PROOF 


Now 1 - q # 0 since q # 1, and we may solve for s n , finding 


( 6 ) 





Since \q\ < 1, the last term approaches zero as n — > «. Hence if |?| < 1, the series is 
convergent and has the sum 1/(1 — q). This completes the proof. ■ 


Ratio Test 

This is the most important test in our further work. We get it by taking the geometric 
series as comparison series b x + b 2 + • • • in Theorem 5: 


Ratio Test 

If a series + z 2 + ' ‘ ' with z n ¥= 0 (n = 1, 2, • • •) has the property that for 
every n greater than some N, 


(7) 


Zn+i 


g?<l 


(n> N) 


{where q < 1 is fixed), this series converges absolutely. If for every n> N, 


( 8 ) 


Zn+1 

Zn 


{n > N), 


the series diverges. 


If (8) holds, then |z n+1 | g \z n \ for n > N, so that divergence of the series follows from 
Theorem 3. 

If (7) holds, then |z n+1 | |z n | q for n > N, in particular, 

Nn+21 — Un+iI#’ I Zn + 3 I = |zN+2k = |zjv+ik 2 > e * c, » 
and in general, |z N+p | = |zw+ik p-1 - Since 9 < 1, we obtain from this and Theorem 6 

kw+il + hv+2l + hv+3l + • • • = (i + q + <? 2 + • * •) = |zn+i! ^ _ • 


Absolute convergence of z\ + z 2 + * * * now follows from Theorem 5. ■ 

CAUTION! The inequality (7) implies |z M +i^nl < U but this does not imply 
convergence, as we see from the harmonic series, which satisfies z n+ ik n = n/(n + 1) < 1 
for all n but diverges. 

If the sequence of the ratios in (7) and (8) converges, we get the more convenient 
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THEOREM 8 


PROOF 


EXAMPLE 4 


Ratio Test 

%n+ 1 


If a series Z\ + z 2 + • • • with z n =£ 0 (n = 1, 2, • • •) is such that lim 

= L, 

TV— 

then: 

(a) If L < 1, the series converges absolutely . 

(b) If L > 1, the series diverges. 

z n 


(c) If L = 1, the series may converge or diverge , so that the test fails and 
permits no conclusion. 


(a) We write kn = |z n +i^nl and let L = 1 — b < 1. Then by the definition of limit, the 
k n must eventually get close to 1 — 6, say, k n ^ q = 1 — \b < 1 for all n greater than 
some N. Convergence of Zi + Z 2 + * ' * now follows from Theorem 7. 

(b) Similarly, for L = 1 + c > 1 we have ^ 1 4- \c > 1 for all n > N* (sufficiently 
large), which implies divergence of Z\ + z 2 + ‘ * * by Theorem 7. 

(c) The harmonic series 1 + \ + § + ••• has z n+ i/z w = ttHjn + 1). hence L = 1, and 
diverges. The series 


1 



_1_ J_ 
16 + 25 


+ • • • 


has 


_ n 

Zn (n + l ) 2 ’ 


hence also L = 1, but it converges. Convergence follows from (Fig. 361) 


1 1 r n dx 1 

= 1 + T + -- - + - ? S1+I -g- = 2 , 

4 /r •'i x n 


so that * * * is a bounded sequence and is monotone increasing (since the terms of 
the series are all positive); both properties together are sufficient for the convergence of 
the real sequence s l9 s 2 , • • • . (In calculus this is proved by the so-called integral test, 
whose idea we have used.) ■ 



Ratio Test 

Is the following series convergent or divergent? (First guess, then calculate.) 


£ (I00 + 75/) n 1 

2 — t = I + (100 + 75/) + — (100 + 75/) 2 + • • • 

1=0 
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EXAMPLE 5 


THEOREM 9 


PROOF 


THEOREM 10 


Solution . By Theorem 8 , the series is convergent, since 


Zn + 1 
Z» 


|100 + 75/| n+1 /(» + i)! _ 1 100 + 75/| _ 125 

|100 + 75(| ra /n! ~ «+l _ «+l 


L = 0. ■ 


Theorem 7 More General than Theorem 8 

Let a n = it 2 s " and b n = 1/2 3to+1 , Is the following series convergent or divergent? 


ciq + b 0 + cii + b\ + ■ 


I i 

~ 1 + 2 + 8 


16 


+ 64 


1 

128 


Solution . The ratios of the absolute values of successive terms are 5 , * « • . Hence convergence follows 
from Theorem 7. Since the sequence of these ratios has no limit, Theorem 8 is not applicable. M 


Root Test 

The ratio test and the root test are the two practically most important tests. The ratio test 
is usually simpler, but the root test is somewhat more general. 


Root Test 


If a series z± + z 2 + 

• • is such that for every n greater than some N, 

(9) 

§?<1 (n > A/) 

( where q < 1 is fixed), this series converges absolutely. If for infinitely many n, 

(10) 

the series diverges. 

N 1 
^ | 

IIV 


If (9) holds, then \z n \ = q n < 1 for all n > N. Hence the series \zi\ + |z 2 | + * ’ ’ converges 
by comparison with the geometric series, so that the series z± + z 2 + ' ' ' converges 
absolutely. If (10) holds, then |z n | ^ 1 for infinitely many n. Divergence of Z\ + z 2 + * * * 
now follows from Theorem 3. ■ 

CAUTION! Equation (9) implies ^f \ zj < 1, but this does not imply convergence, as 
we see from the harmonic series, which satisfies ^Y\fn < 1 (for n > 1) but diverges. 

If die sequence of the roots in (9) and (10) converges, we more conveniently have 


Root Test 

If a series Zi + z 2 + ‘ ‘ ‘ is such that lim = L, then: 

71^*00 

(a) The series converges absolutely if L < 1. 

(b) The series diverges if L> 1. 

(c) If L = 1, the test fails; that is, no conclusion is possible. 
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PROOF The proof parallels that of Theorem 8. 

(a) Let L = 1 — a* < 1. Then by the definition of a limit we have 

< q = 1 - < 1 for all n greater than some (sufficiently large) N*. Hence 

\z n \ < q n < 1 for all /z > N*. Absolute convergence of the series ^ -h z 2 + • ■ • now 
follows by the comparison with the geometric series. 

(b) If L > 1 , then we also have VjzJ > 1 for all sufficiently large n. Hence \z n \ > 1 
for those n. Theorem 3 now implies that Z\ + z 2 + * * ‘ diverges. 

(c) Both the divergent harmonic series and the convergent series 

1+| + | + ^ - give L = 1. This can be seen from (In n)/n — > 0 and 


n/T = _L = 1 J_ 

V n n lln e <UrOlnn ^0 


nf 1 = JL = 1 

A/ ^2 ^2/n ^(2/n) In n 




1-10 


SEQUENCES 


Are the following sequences Zi, z 2 , ' ’ * > z n , • • • bounded? 
Convergent? Find their limit points. (Show the details of 


your work.) 

1. In = C— l) n + //2 n 
3. z n = (-1 ) n /(w 4 /) 
5. z n = Ln((2 + /) n ) 

7. = sin («7t/4) 4- i n 

9. z n = (0.9 4 0.1 /) 2n 


2. z n = <T nwi ' 4 
4. z n = (I 4 i) n 
6. z n = (3 4 4/) w /rt! 

8. = [(1 + 3/)/Vl0] n 

10. Z n = (. 5 + 5 /)-" 


11. Illustrate Theorem 1 by an example of your own. 

12. (Uniqueness of limit) Show that if a sequence 
converges, its limit is unique. 

13. (Addition) If Zi, z 2 , * * * converges with the limit / and 
Zi*, z 2 *, * ’ * converges with the limit /*. show that 
h + Zi*, z 2 4 z 2 *. * * * converges with the limit / 4 /*. 

14. (Multiplication) Show that under the assumptions of 
Prob. 13 the sequence ziZi*, z 2 Z 2 *» ' ’ ' converges 
with the limit //*. 


15. (Boundedness) Show that a complex sequence is 
bounded if and only if the two corresponding sequences 
of the real parts and of the imaginary parts are bounded. 


16-24 


SERIES 


Are the following series convergent or divergent? (Give a 
reason.) 


(10 ~ 15/) n 

i®* 2/ 1 


n=0 

;n 


m-2-4 


n=0 


n 2 - 2 i 


00 * 

20. 2 -±- 

n=2 ,n " 


„ £ (-D”(l + 2 /) 2n+1 

17. 2 j 


rt-0 


(2n + 1)! 


19.2 4= 

Vn 


21 . 2 - 


oc j n 

n 


n-1 



23. 2 

n«0 


« - / 

3 n 4 2/ 


25. What is the difference between (7) and just stating 

|z«+i/z»l < 1? 

26. Illustrate Theorem 2 by an example of your choice. 

27. For what n do we obtain the term of greatest absolute 
value of the series in Example 4? About how big is it? 
First guess, then calculate it by the Stirling formula in 
Sec. 24.4. 

28. Give another example showing that Theorem 7 is more 
general than Theorem 8. 

29. CAS PROJECT. Sequences and Series, (a) Write a 
program for graphing complex sequences. Apply it to 
sequences of your choice that have interesting 
“geometrical” properties (e.g., lying on an ellipse, 
spiraling toward its limit, etc.). 

(b) Write a program for computing and graphing 
numeric values of the first n partial sums of a series 
of complex numbers. Use the program to experiment 
with the rapidity of convergence of series of your 
choice. 

30. TEAM PROJECT. Series, (a) Absolute convergence. 
Show that if a series converges absolutely, it is 
convergent. 

(b) Write a short report on the basic concepts and 
properties of series of numbers, explaining in each case 
whether or not they carry over from real series 
(discussed in calculus) to complex series, with reasons 
given. 
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(c) Estimate of the remainder. Let |z n+ i/z n | = <7 < L 
so that the series z\ + z 2 + * * * converges by the ratio 
test. Show that the remainder R n = z n + 1 + z n + % + * • * 
satisfies the inequality \R n \ ^ bwil/O - <?)• 

(d) Using (c), find how many terms suffice for 
computing the sum s of the series 


“ 2 n n 

n = 1 

with an error not exceeding 0.05 and compute s to this 
accuracy. 

(e) Find other applications of the estimate in (c). 


15.2 Power Series 


Power series are the most important series in complex analysis because we shall see that 
their sums are analytic functions, and every analytic function can be represented by power 
series (Theorem 5 in Sec. 15.3 and Theorem 1 in Sec. 15.4). 

A power series in powers of z — Zo is a series of the form 


00 

(1) 2 «n(Z - Zo)” = «0 + «l(z “ Zo) + a 2 (Z - Zo) 2 + • • 4 

n= 0 


where z is a complex variable, a 0) a ly • • • are complex (or real) constants, called the 
coefficients of the series, and Zq is a complex (or real) constant, called the center of the 
series. This generalizes real power series of calculus. 

If z 0 = 0, we obtain as a particular case a power series in powers of z: 


( 2 ) 


00 

2 a«z” = a 0 + a x z 4- a 2 z 2 + • * * . 

n=0 


Convergence Behavior of Power Series 

Power series have variable terms (functions of z)> but if we fix z, then all the concepts 
for series with constant terms in the last section apply . Usually a series with variable 
terms will converge for some z and diverge for others. For a power series the situation is 
simple. The series (1) may converge in a disk with center z 0 or in the whole z-plane or 
only at Zo- We illustrate this with typical examples and then prove it. 

EXAMPLE 1 Convergence in a Disk. Geometric Series 

The geometric series 

2 z n = I + z + z 2 -f * * 4 

n=0 

converges absolutely if |zj < 1 and diverges if |z| S 1 (see Theorem 6 in Sec. 15.1). M 


EXAMPLE 2 Convergence for Every z 

The power series (which will be the Maclaurin series of / in Sec. 15.4) 

„2 ^3 

31 


2 - 1 + z+ ~ — + 


n-0 


Z_ 

21 
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is absolutely convergent for every z , In fact, by the ratio test, for any fixed z, 

M 


z n+1 /(n + 1)! 


z n in\ 


n 4- 1 


0 as n — > so. 


EXAMPLE 3 Convergence Only at the Center. (Useless Series) 

The following power series converges only at z = 0, but diverges for every z =£ 0, as we shall show. 


2 n\z n = 1 + z + 2z 2 + 6z 3 + 
«» 0 


In fact, from the ratio test we have 


(// + l)!z n 


nlz 


= (n + I ) |z| —► oc as n — > oo (z fixed and ^ 0). 


THEOREM 1 


Convergence of a Power Series 

(a) Every power series (1) converges at the center Zq. 

(b) If ( 1) converges at a point z = Z\i=- z 0 > it converges absolutely for every z 
closer to Zq than z x , that is, \z — Zq\ < \z± — ZqI* See Fig. 362. 

(c) If ( 1) diverges at a z — Z<z, it diverges for every z farther away from Zo 
than z 2 - See Fig. 362. 


Divergent 


/ Conv. 


V 1 

\ 


\ 

\ 

I 

t Z 2 


Fig. 362. Theroem 1 


PROOF (a) For z = z 0 the series reduces to the single term a 0 . 

(b) Convergence at z = Z\ gives by Theorem 3 in Sec. 15.1 a^Zi ~ Zo) n — » 0 as n—> <*>. 
This implies boundedness in absolute value, 

\a„(Zi - Zo)"| < M for every n = 0, 1, • • • . 


Multiplying and dividing a n (z — Zo) n by (z L — Zo) n we obtain from this 


a«(z - z 0 ) n \ = 

a n (z i ~ z 0 ) n ( ) 

HA 


\ z i “ Zo / 

1 


1 71 
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Summation over n gives 

(3) 2 Kte - Zo) n \ S M 2 2 ~ . 

n=l «= 1 Zl Z ° 

Now our assumption |z — z 0 l < l-i ~~ £ol implies that \(z — Zo)Kz\ ~ Zo)\ < 1. Hence the 
series on the right side of (3) is a converging geometric series (see Theorem 6 in 
Sec. 15.1). Absolute convergence of (1) as stated in (b) now follows by the comparison 
test in Sec. 15.1. 

(c) If this were false, we would have convergence at a z 3 farther away from Zq than z 2 . 
This would imply convergence at z 2 , by (b), a contradiction to our assumption of 
divergence at z 2 . ■ 

Radius of Convergence of a Power Series 

Convergence for every z (the nicest case, Example 2) or for no z & Zo (the useless case. 
Example 3) needs no further discussion, and we put these cases aside for a moment. We 
consider the smallest circle with center z 0 that includes all the points at which a given 
power series (1) converges. Let R denote its radius. The circle 

\z -Zo\=R (Fig. 363) 

is called the circle of convergence and its radius R the radius of convergence of (1). 
Theorem 1 then implies convergence everywhere within that circle, that is, for all z for 
which 

(4) | z — Zo\ < R 

(the open disk with center zq and radius /?). Also, since R is as small as possible, the series 
(1) diverges for all z for which 

(5) \z - z 0 \ > R- 

No general statements can be made about the convergence of a power series (1 ) on the 
circle of convergence itself. The series (I) may converge at some or all or none of these 
points. Details will not be essentia] to us. Hence a simple example may just give us the 
idea. 



Fig. 363. Circle of convergence 
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EXAMPLE 4 Behavior on the Circle of Convergence 

On the circle of convergence (radius R = 1 in all three series), 

2 z n /n 2 converges everywhere since 2 1 hi 2 converges, 

2 z n h\ converges at - 1 (by Leibniz’s test) but diverges at 1, 

2 z n diverges everywhere. ■ 

Notations R = and R = 0. To incorporate these two excluded cases in the present 
notation, we write 

R = }f the series (1) converges for all z (as in Example 2), 

R = 0 if (1) converges only at the center z = Zo (as in Example 3). 

These are convenient notations, but nothing else. 

Real Power Series. In this case in which powers, coefficients, and center are real, 
formula (4) gives the convergence interval \x - * 0 | < R of length 2 R on the real line. 

Determination of the Radius of Convergence from the Coefficients. For this 
important practical task we can use 


THEOREM 2 


Radius of Convergence R 

Suppose that the sequence |a n+1 /a w |, n = 1, 2, • * • , converges with limit L*. If 
L* = 0, then R = *>; that is, the power series (1) converges for all z. If L* ^ 0 
{hence L* > 0), then 


( 6 ) 


R = —r = lim 

L* w— >oo 


An 
a n + 1 


(Cauchy-Hadamard formula 1 ). 


If \a n+1 /a n \ — > co, then R = 0 {convergence only at the center z 0 ). 


PROOF For (1) the ratio of the terms in the ratio test (Sec. 15. 1 ) is 


«n+l(Z “ Zo) n+1 


a n + 1 

a n {z - z 0 ) n 




\z - Zo\. 


The limit is L = L*|z — zj- 


Let L* ¥= 0, thus L* > 0. We have convergence if L = L*|z — Zol < thus |z — Zol < !/£*> 
and divergence if |z — Zol > By (4) and (5) this shows that l/L* is the convergence 
radius and proves (6). 

If L* = 0, then L = 0 for every z, which gives convergence for all z by the ratio test. 
If \a n+1 /a n \ — > oo, then \ci n+1 /a n \\z - zol > 1 for any z ^ Zo and all sufficiently large n. 
This implies divergence for all z i=- Zo by the ratio test (Theorem 7, Sec. 15.1). ■ 


Earned after the French mathematicians A. L. CAUCHY (see Sec. 2.5) and JACQUES HADAMARD 
(1865-1963). Hadamard made basic contributions to the theory of power series and devoted his lifework to 
partial differential equations. 
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Formula (6) will not help if L* does not exist, but extensions of Theorem 2 are still 
possible, as we discuss in Example 6 below. 

EXAMPLE 5 Radius of Convergence 

“ (2/i) ! 

By (6) the radius of convergence of the power series 2 2 ~ 3/) n is 


r (2 n)! 

/ (2w + 2)! ■ 

- lim T (2 " )! 

((»+ DO 2 ' 

(» + I) 2 

L(«!) 2 / 

' ((« + l)!) 2 . 

n-o= l (2n + 2)! 

(h!) 2 . 

n^oc (2/i + 2)(2w + 1) 


The series converges in the open disk |s — 3 /| < J of radius \ and center 3 /. ■ 

EXAMPLE 6 Extension of Theorem 2 

Find the radius of convergence R of the power series 

| 0 [ l + , " ,> ' + F] I "- 3+ i z+ ( 2 + i) !! + ?‘- s+ ( 2+ T?) ! ‘ + -• 

Solution . The sequence of the ratios 1/6, 2(2 + 5), l/(8(2 + 5)), • • • does not converge, so that Theorem 
2 is of no help. It can be shown that 

(6*) R = ML. L = lim VjflJ. 

Hr-* OC 

This still does not help here, since {V |a n |} does not converge because V|aJ = V l/2 w = 1/2 for odd n, 
whereas for even n we have 

VjflJ = ^2 + i/2 n 1 as co, 

so that VjoJ has the two limit points 1/2 and 1 . It can further be shown that 


(6**) R - Ml, l the greatest limit point of the sequence 

Here T — l , so that R = l . Answer. The series converges for \z\ < 1 . 


{■Oti). 


Summary. Power series converge in an open circular disk or some even for every z (or 
some only at the center, but they are useless); for the radius of convergence, see (6) or 
Example 6. 

Except for the useless ones, power series have sums that are analytic functions (as we 
show in the next section); this accounts for their importance in complex analysis. 


-SET— 15 . 2 


\t(^ 


1. (Powers missing) Show that if 2 a n z n has radius of “ n\ 

convergence R (assumed finite), then 2 a n z 2n has radius Zj fe + 1) 

of convergence V/?. Give examples. n "° 

2. (Convergence behavior) Illustrate the facts shown by ^ / a Y* n 

Examples 1-3 by further examples of your own. 0 l^/ " 

3-18| RADIUS OF CONVERGENCE 

Find the center and the radius of convergence of the ^ 2 0* “ i) n z n 
following power series. (Show the details.) n ”° 

x ( 7 a - i\ n 30 /_nn+l 

3.2^^- 4.2^-fe + 20* U.2 1 -^— 

n=l n ?i=0 n! „=1 « 


*2 

n *=0 

* (-IT _ 2n 

„=o 2 2n («!) 2 4 

“SiTTif*-"’ 
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13. 2 n(n - \)(z - 3 + 2i) n 

n = 2 


14 y ... 

... <*'>! ‘ 

"■i.iirn-h-* 


18- S ^4 fe + *0* 


»=0 


2 n (/i!) 4 


IS. 2 2 n (z - i) 4 


17. y — - 2n 

A/ ^ 9?i “• 

71-0 


19. CAS PROJECT. Radius of Convergence. Write a 
program for computing R from (6), (6*), or (6**)> in 
this order, depending on the existence of the limits 
needed. Test the program on series of your choice and 


such that all three formulas (6), (6*), and (6**) will 
come up. 

20. TEAM PROJECT. Radius of Convergence, (a) 

Formula (6) for/? contains \a n !a n+l \, not 

How could you memorize this by using a qualitative 

argument? 

(b) Change of coefficients. What happens to 
R (0 < R < oo) if you (i) multiply all a n by k ¥* 0, 
(ii) multiply a n by k n =£ 0, (iii) replace a„ by l/n n ? 

(c) Example 6 extends Theorem 2 to nonconvergent 
cases of a w /fl n+l . Do you understand the principle of 
“mixing” by which Example 6 was obtained? Use this 
principle for making up further examples. 

(d) Does there exist a power series in powers of r that 
converges at z = 30 + 10/ and diverges at z = 31 — 6/? 
(Give reason.) 


15.3 Functions Given by Power Series 

The main goal of this section is to show that power series represent analytic functions 
(Theorem 5). Along our way we shall see that power series behave nicely under addition, 
multiplication, differentiation, and integration, which makes these series very useful in 
complex analysis. 

To simplify the formulas in this section, we take Zq = 0 and write 


(1) X a n z n . 

n — 0 

This is no restriction because a series in powers of z — Zo with any z 0 can always be 
reduced to the form (1) if we set z - Zo = z- 

Terminology and Notation. If any given power series (l) has a nonzero radius of 
convergence R (thus R > 0), its sum is a function of z, say f(z). Then we write 


( 2 ) 


0 c 

f(z) = 2 o n z n = a 0 + a x z + a 2 z 2 + • * • 


71 = 0 


(\z\ < R). 


We say that f(z) is represented by the power series or that it is developed in the power 
series. For instance, the geometric series represents the function f(z) = 1/(1 — z) in the 
interior of the unit circle \z\ = I. (See Theorem 6 in Sec. 15.1.) 

Uniqueness of a Power Series Representation. This is our next goal. It means that 
a function f(z) cannot be represented by two different power series with the same 
center. We claim that if f(z) can at all be developed in a power series with center Zq . the 
development is unique. This important fact is frequently used in complex analysis (as well 
as in calculus). We shall prove it in Theorem 2. The proof will follow from 
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THEOREM 1 Continuity of the Sum of a Power Series 

If a function f(z) can be represented by a power series (2) with radius of convergence 
R > 0, then f(z ) is continuous at z — 0. 


PROOF From (2) with z — 0 we have /( 0) = a 0 . Hence by the definition of continuity we must 
show that lirn^o f(z) = /( 0) = a 0 . That is, we must show that for a given e > 0 there 
is a 8 > 0 such that \z\ < 8 implies \f(z ) — a Q \ < e. Now (2) converges absolutely for 
|z| = r with any r such that 0 < r < R, by Theorem 1 in Sec. 15.2. Hence the series 

2 Kk" _1 = 7 E Kk” 

n=l n—X 

converges. Let 5 ¥> 0 be its sum. (5 = 0 is trivial.) Then for 0 < |z| = r, 

|/(z) - flol = E a nZ n = \z\ 2 kJ kr _1 = \z\ 2 = \z\s 

n= 1 n—X n= 1 

and \z\S < e when \z\ < 8 , where 8 > 0 is less than r and less than e/5. Hence 
\z\S < 8S < (e/5) 5 = e. This proves the theorem. ■ 

From this theorem we can now readily obtain the desired uniqueness theorem (again 
assuming zq = 0 without loss of generality): 


THEOREM 2 Identity Theorem for Power Series. Uniqueness 

Let the power series a$ -1- a x Z + a 2 z 2 4- • • • and b 0 + b±z + b 2 z 2 + • • • both be 
convergent for \z\ < /?, where R is positive , and let them both have the same sum for 
all these z. Then the series are identical that is, a 0 — b 0 , — b u a 2 = b 2 , • • • . 

Hence if a function f(z) can be represented by a power series with any center Zq, 
this representation is unique . 


PROOF We proceed by induction. By assumption, 

+ a x z + a 2 z 2 + • • • = -f b r z. 4- b 2 z 2 4 ■ • ■ • (\z\ < R). 

The sums of these two power series are continuous at z = 0, by Theorem 1 . Hence if we 
consider \z\ > 0 and let z — > 0 on both sides, we see that a Q = b 0 : the assertion is true 
for n = 0. Now assume that a n = b n for n = 0, 1, • • • , m. Then on both sides we may 
omit the terms that are equal and divide the result by z m+1 (# 0); this gives 

a m + 1 4“ a in + 2 z 4- flm+sZ 4" b m + 1 4- b m + 2 z 4* bm+^z 4 • • • . 


Similarly as before by letting z — > 0 we conclude from this that a m + x = b m+1 . This 
completes the proof. ■ 
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Operations on Power Series 

Interesting in itself, this discussion will serve as a preparation for our main goal, namely, 
to show that functions represented by power series are analytic. 

Termwise addition or subtraction of two power series with radii of convergence R x 
and R 2 yields a power series with radius of convergence at least equal to the smaller of 
/?! and R 2 . Proof. Add (or subtract) the partial sums s n and s* term by term and use 
lim (s n ± s*) = lim s n ± lim s*. 

Termwise multiplication of two power series 

00 

f(z) = 2 a kZ k = a 0 + a x z + ••• 

, k=0 

and 

oc 

g(z) = 2 b m z m = b 0 + b x z + • • • 

0 

means the multiplication of each term of the first series by each term of the second series 
and the collection of like powers of z. This gives a power series, which is called the 
Cauchy product of the two series and is given by 


a 0 b 0 4- (a 0 b x 4- ci x b 0 )z + ( a 0 b 2 + a x b x 4- a 2 b 0 )z 2 + • * • 

oc 

= 2 («o K + «!&„_! + • • • + a n b 0 )z n . 

71=0 

We mention without proof that this power series converges absolutely for each z within 
the circle of convergence of each of the two given series and has the sum s(z) = f(z)g(z). 
For a proof, see [D5] listed in App. 1. 

Termwise differentiation and integration of power series is permissible, as we show 
next. We call derived series of the power series (1) the power series obtained from (1) 
by termwise differentiation, that is, 

oc 

(3) 2 «a n z n-1 = fli + 2 a 2 z + 3 a 3 z 2 + • • • . 

71=1 


THEOREM 3 


Termwise Differentiation of a Power Series 

The derived series of a power series has the same radius of convergence as the 
original series. 


PROOF This follows from (6) in Sec. 1 5.2 because 


(n + 1) |o n+1 | 


= lim 


lim 


n— >oc n 4" 1 7i—^oo 


^71+1 


= lim 

n-* o© 




a n+\ 


or, if the limit does not exist, from (6**) in Sec. 15.2 by noting that — > 1 as n — » 
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EXAMPLE 1 Application of Theorem 3 

Find the radius of convergence R of the following series by applying Theorem 3. 

2 Q z n = z 2 + 3 z a + 6 z 4 + I0z s + • • • . 

Solution . Differentiate the geometric series twice term by term and multiply the result by £ 2 /2. This yields 
the given series. Hence R = 1 by Theorem 3. I 


THEOREM 4 


Termwise Integration of Power Series 

The power series 


71=0 


n + 1 


z n+l _ a ^ z + 


*1 

2 


z 2 + 


02 

3 


z 3 + 


obtained by integrating the series a 0 + a-^z + a 2 z 2 + • • • term by term has the 
same radios of convergence as the original series . 


The proof is similar to that of Theorem 3. 

With Theorem 3 as a tool, we are now ready to establish our main result in this section. 

Power Series Represent Analytic Functions 


THEOREM 5 


Analytic Functions. Their Derivatives 

A power series with a nonzero radius of convergence R represents an analytic 
function at every point interior to its circle of convergence. The derivatives of this 
function are obtained by differentiating the original series term by term. All the 
series thus obtained have the same radius of convergence as the original series. 
Hence, by the first statement, each of them represents an analytic function. 


PROOF (a) We consider any power series (1) with positive radius of convergence R. Let f(z) be 
its sum and f x (z) the sum of its derived series; thus 


oo oc 

(4) f(z) = 2 a n z n and /i(z) = 2 m n z n ~ x . 

0 7i—l 

We show that f{z) is analytic and has the derivative f\(z) in the interior of the circle of 
convergence. We do this by proving that for any fixed z with |z| < R and Az — » 0 the 
difference quotient [/(z + Az) — /(z)]/Az approaches fi(z). By termwise addition we first 
have from (4) 


(5) 


/(z + Az) - /(z) 
Az 


- fi(z) 


co r 

2 

n=2 L 


(z + Az) w - z” 
Az 


— «z n_1 . 


Note that the summation starts with 2, since the constant term drops out in taking the 
difference f(z + Az) — /(z), and so does the linear term when we subtract / x (z) from the 
difference quotient. 
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(b) We claim that the series in (5) can be written 

oc 

(6) 2 fl„.Az[(z + Az) n_2 + 2z(z + Az) n-3 + •••+(«- 2 )z m-3 (z + Az) 

n=2 + (i» - 1 )z’- 2 ]. 


The somewhat technical proof of this is given in App. 4. 

(c) We consider (6). The brackets contain n — 1 terms, and the largest coefficient is 
n — 1. Since ( n — l) 2 ^ n(n — 1), we see that for |z| = Ro and | z + Az| = /?o» < R* 

the absolute value of this series (6) cannot exceed 


(7) |Az| 2 kl«(« - l)/?o 2 

n« 2 

This series with a n instead of |« n | is the second derived series of (2) at z = Ro and converges 
absolutely by Theorem 3 of this section and Theorem 1 of Sec. 15.2. Hence our present 
series (7) converges. Let the sum of (7) (without the factor |Az|) be K(R 0 ). Since (6) is 
the right side of (5), our present result is 


f(z + A z) ~ f(z) 
A z 


- fi(z) 


^ |Az| K(R 0 ). 


Letting Az — » 0 and noting that R 0 (< R) is arbitrary, we conclude that /(z) is analytic at 
any point interior to the circle of convergence and its derivative is represented by the derived 
series. From this the statements about the higher derivatives follow by induction. ■ 


Summary. The results in this section show that power series are about as nice as we 
could hope for: we can differentiate and integrate them term by term (Theorems 3 and 4). 
Theorem 5 accounts for the great importance of power series in complex analysis: the 
sum of such a series (with a positive radius of convergence) is an analytic function and 
has derivatives of all orders, which thus in turn are analytic functions. But this is only 
part of the story. In the next section we show that, conversely, every given analytic function 
/(z) can be represented by power series, called Taylor series and being the complex 
analog of the real Taylor series of calculus. 




|l-l<)| RADIUS OF CONVERGENCE BY 

DIFFERENTIATION OR INTEGRATION 

Find the radius of convergence in two ways: (a) directly by 
the Cauchy-Hadamard formula in Sec. 15.2, (b) from a 
series of simpler terms by using Theorem 3 or Theorem 4. 


4.2 

n=0 


(-i r 

2 n + 1 



s-2 y,/,( ; + 0 


oc 

1.2 


n(n — 1) 
3* 


(Z ~ 2l) n 



2.2 

M=1 




n(n + I) 


7.2 

n-l 


n(n + !)(/; + 2) 


r 2n 


^ n 

( z + /)■ 


g V - 2W(2 ” JSn-2 

n“ 
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11. (Addition and subtraction) Write out the details of 
the proof on termwise addition and subtraction of 
power series. 

12. (Cauchy product) Show that 

(1 - z)~ 2 = 2*= 0 (n + l)z n (a) by using the Cauchy 
product, (b) by differentiating a suitable series. 

13. (Cauchy product) Show that the Cauchy product of 

multiplied by itself gives 2^ =0 (2 z) n lnl. 

14. (On Theorem 3) Prove that — > 1 as n — » oc (as 

claimed in the proof of Theorem 3). 

15. (On Theorems 3 and 4) Find further examples of your 
own. 

16-20 1 APPLICATIONS OF THE IDENTITY 
THEOREM 

State clearly and explicitly where and how you are using 
Theorem 2. 

16. (Bionomial coefficients) Using 

(I + z) p (l + z) q = (1 + z) p + q , obtain the basic 
relation 



17. (Odd function) If f(z) in (1) is odd (i.e., 

/(— z) = ~f(z)\ show that a n = 0 for even n. Give 
examples. 

18. (Even functions) If f(z) in (1) is even (i.e., 
/(— z) = /(z)), show that a n = 0 for odd n. Give 
examples. 

19. Find applications of Theorem 2 in differential equations 
and elsewhere 

20. TEAM PROJECT. Fibonacci numbers. 2 (a) The 
Fibonacci numbers are recursively defined by 
a 0 = a x = l, a v+1 = a n + a n - t if n = 1, 2, • • • . 
Find the limit of the sequence (a n + i/a n ). 

(b) Fibonacci’s rabbit problem. Compute a list of 
fli, • * * , tf 12 . Show that a 12 = 233 is the number of 
pairs of rabbits after 12 months if initially there is 1 
pair and each pair generates 1 pair per month, 
beginning in the second month of existence (no deaths 
occurring). 

(c) Generating function. Show that the generating 
function of the Fibonacci numbers is 

f(z) = 1/(1 — z — z 2 ); that is, if a power series (1) 
represents this /(z), its coefficients must be the 
Fibonacci numbers and conversely. Hint. Start from 
f(z) (1 — z — z 2 ) = 1 and use Theorem 2. 


15.4 Taylor and Maclaurin Series 

The Taylor series 3 of a function f(z), the complex analog of the real Taylor series is 

(1) f(z) = 2 a n (z - ZoT where a n = - - f (n \z 0 ) 

n\ 


or, by (1), Sec. 14.4, 

( 2 ) 



In (2) we integrate counterclockwise around a simple closed path C that contains zo in 
its interior and is such that f(z) is analytic in a domain containing C and every point 
inside C. 

A Maclaurin series 3 is a Taylor series with center Zq = 0. 


2 LE0NARD0 OF PISA, called FIBONACCI (= son of Bonaccio), about 1 180-1250, Italian mathematician, 
credited with the first renaissance of mathematics on Christian soil. 

3 BROOK TAYLOR (1685-1731), English mathematician who introduced real Taylor series. COLIN 
MACLAURIN (1698-1746), Scots mathematician, professor at Edinburgh. 



684 


CHAP. 15 Power Series, Taylor Series 


THEOREM 1 


PROOF 


( 3 ) 


The remainder of the Taylor series (1) after the term a n (z - z 0 ) n is 
(Z - Zo ) n+1 £ f(z*) 


Rn(z) = 


2m 


t 


(z* - z 0 ) n+1 (z* - z) 


dz* 


(proof below). Writing out the corresponding partial sum of (1), we thus have 


f(z) = f(.z 0 ) + - ., Z ° f'(z 0 ) + — ■ f(Zo) + 


( 4 ) 


1! 

(z - z 0 ) n 


nl 


2! 


f (n \zo) + Ruiz). 


This is called Taylor’s formula with remainder . 

We see that Taylor series are power series . From the last section we know that power 
series represent analytic functions. And we now show that every analytic function can be 
represented by power series, namely, by Taylor series (with various centers). This makes 
Taylor series very important in complex analysis. Indeed, they are more fundamental in 
complex analysis than their real counterparts are in calculus. 


Taylor's Theorem 

Let f(z) be analytic in a domain D, and let z - Zo be any point in D. Then there 
exists precisely one Taylor series (1) with center z 0 that represents f(z). This 
representation is valid in the largest open disk with center z 0 in which f(z) is analytic. 
The remainders R n (z) of ( 1) can be represented in the form (3). The coefficients 
satisfy the inequality 


( 5 ) 



where M is the maximum of \f(z)\ on a circle \z ~ Zo\ = r in D whose interior is 
also in D. 


The key tool is Cauchy’s integral formula in Sec. 14.3; writing z and z* instead of Zo and 
z (so that z* is the variable of integration), we have 


( 6 ) 


m = 



/(**) 
z* - z 


Jz*. 


z lies inside C, for which we take a circle of radius /• with center Zo and interior in D 
(Fig. 364). We develop l/(z* - z) in (6) in powers of z — z 0 * By a standard algebraic 
manipulation (worth remembering!) we first have 


J 


z* - z 


1 

z* - Z 0 - (z- Zo) 


1 


(z* - Zo) 



z - Zo \ 

Z* - Zo) 


(7) 
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For later use we note that since z* is on C while z is inside C, we have 


(7*) 



(Fig. 364). 



Fig. 364. Cauchy formula (6) 


To (7) we now apply the sum formula for a finite geometric sum 

1 - a n+l 1 a n+1 

(8*) 1 + q + • • • + q n = — = y (9 * 1), 

1 - <7 1 ~ q I “ q 


which we use in the form (take the last term to the other side and interchange sides) 

1 q n+1 

(8) = 1 +«+■■■ + q n 4- f . 

\ - q \ ~ q 

Applying this with q = (z — Zq)/(z * ~ Zq) to the right side of (7), we get 



We insert this into (6). Powers of z — Zo do not depend on the variable of integration z*, 
so that we may take them out from under the integral sign. This yields 


1 r f(z*) 

m = — f dz* + 

2 777 J c Z * - Zq 


£ - Zo 1 f(z*) 

2m ? c (z* - Zq) 1 2 


dz * + • • * 


• • • 4- 


(z - Z 0 ) n r /(z*) 

2m J c (z* - z 0 ) n+1 


dz * + R n (z) 


with R n (z) given by (3). The integrals are those in (2) related to the derivatives, so that 
we have proved the Taylor formula (4). 

Since analytic functions have derivatives of all orders, we can take n in (4) as large as 
we please. If we let n approach infinity, we obtain (1). Clearly, (I) will converge and 
represent f(z) if and only if 


lim R n (z) = 0. 

n-vcc 


( 9 ) 
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THEOREM 2 


PROOF 


We prove (9) as follows. Since z* lies on C, whereas z lies inside C (Fig. 364). we have 
|z* — z\ > 0. Since f(z) is analytic inside and on C, it is bounded, and so is the function 
/(z*)/(z* - <.), say, 



^ M 


for all z* on C. Also, C has the radius r = |z* — Zol and the length 2 tt 7\ Hence by the 
ML-inequality (Sec. 14.1) we obtain from (3) 


( 10 ) 


I* | = - Btl U d9 * 

1 nl 2 TT |j c (z* - z 0 )" +1 (z* - z) ~ 


ln-t-1 


27 T 




-0 


n+1 


Now |z — Zol < r because z lies inside C. Thus |z - Zo|/>‘ < 1, so that the right side 
approaches 0 as n — » 0°. This proves the convergence of the Taylor series. Uniqueness 
follows from Theorem 2 in the last section. Finally, (5) follows from (1) and the Cauchy 
inequality in Sec. 14.4. This proves Taylor’s theorem. ■ 


Accuracy of Approximation. We can achieve any preassinged accuracy in 
approximating /(z) by a partial sum of ( 1 ) by choosing n large enough. This is the practical 
aspect of formula (9). 


Singularity, Radius of Convergence. On the circle of convergence of (1) there is at 
least one singular point of /(z), that is, a point z = c at which /(z) is not analytic (but 
such that every disk with center c contains points at which /(z) is analytic). We also say 
that /(z) is singular at c or has a singularity at c. Hence the radius of convergence R of 
(1) is usually equal to the distance from z 0 to the nearest singular point of /(z). 

(Sometimes R can be greater than that distance: Ln z is singular on the negative real 
axis, whose distance from z 0 = — 1 + / is 1 , but the Taylor series of Ln z with center 
Zo = — 1 + / has radius of convergence V2.) 

Power Series as Taylor Series 

Taylor series are power series — of course! Conversely, we have 


Relation to the Last Section 

A power series with a nonzero radius of convergence is the Taylor series of its sum. 


Given the power series 

f(z) = a 0 + a x (z - Zo) + a 2 (z - z 0 ) 2 + a 3 (z ~ z 0 ) 3 + • • • . 

Then /(zo) = u 0 . By Theorem 5 in Sec. 15.3 we obtain 

f'(z) = <r/i + 2 a z (z - Zo) + 3 a 3 (z - z 0 f + • • • , thus f'(z 0 ) = flj 

f"(z) = 2a z + 3 • 2(z - Zq) + • • • , thus f"(zo) = 2 \a 2 
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EXAMPLE 1 


EXAMPLE 2 


and in general / (n) (z 0 ) = n\a n . With these coefficients the given series becomes the Taylor 
series of f(z ) with center Zq. ■ 


Comparison with Real Functions. One surprising property of complex analytic 
functions is that they have derivatives of all orders, and now we have discovered the other 
surprising property that they can always be represented by power series of the form (1). 
This is not true in general for real functions; there are real functions that have derivatives 
of all orders but cannot be represented by a power series. (Example: fix) = exp (—1/a* 2 ) 
if x # 0 and /( 0) = 0; this function cannot be represented by a Maclaurin series in an 
open disk with center 0 because all its derivatives at 0 are zero.) 


Important Special Taylor Series 

These are as in calculus, with x replaced by complex z . Can you see why? (Answer. The 
coefficient formulas are the same.) 

Geometric Series 

Let f(z) = 1/(1 - z). Then we have f n \z) = «!/(! - z) n+1 . / (n) (0) = w!. Hence the Maclaurin expansion of 
1/(1 - z) is the geometric series 


( 11 ) 



I + z + z 2 + * • • 


(\z\ < I). 


f(z) is singular at z — 1 ; this point lies on the circle of convergence. ■ 

Exponential Function 

We know that the exponential function e z (Sec. 13.5) is analytic for all z, and ( e z ) f = e*. Hence from (1) with 
Zq — 0 we obtain the Maclaurin series 


( 12 ) 



+ z + 



This series is also obtained if we replace jc in the familiar Maclaurin series of e x by z. 

Furthermore, by setting z = iy in (12) and separating die series into the real and imaginary parts (see 
Theorem 2, Sec. 15.1) we obtain 


00 (M n 

P = X ^4- 

«-o " ! 


oo .,2fc oo 

= 2 M) fc 75^7 +i~2 (-i ) k 

k =0 ^ K> - k =*0 


y 2fc + l 

(2k + 1)! ■ 


Since the series on the right are the familiar Maclaurin series of the real functions cosy and siny, this shows 
that we have rediscovered the Euler formula 


(13) 




= cos y + i sin y. 


Indeed, one may use (12) for defining <r and derive from (12) the basic properties of e z . For instance, the 
differentiation formula (<?)' = e z follows readily from (12) by termwise differentiation. ■ 
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EXAMPLE 3 


EXAMPLE 4 


EXAMPLE 5 


Trigonometric and Hyperbolic Functions 

By substituting (12) into (1) of Sec. 13.6 we obtain 


( 14 ) 


oc _2n 


z 2 z 4 

1- 2! + ^! “ + 


00 

sin z = 2 (~l) n 

71=0 


z 2n+1 


= 7 _il . i!_ + 

(2« + 1)! Z 3! 5! 


When z — x these are the familiar Maclaurin series of the real functions cos x and sin x. Similarly, by substituting 
(12) into (II). Sec. 13.6, we obtain 


( 15 ) 


ce r 2n 


T 2 - 4 

= 1 + 1\ + 1\ + 


sinh z = 2 

71 = 0 


z 2 ”* 1 Z 3 , z 5 

(2 n + 1)! “ Z + 3! + 5! 


Logarithm 

From (1) it follows that 

( 16 ) 


z 2 

Ln(i +*) = *- y + y - + 


Replacing z by ~z and multiplying both sides by —l, we get 


1 


-2 .3 


~Ln (1 — z) = Ln ■ = z + — + — + 


(17) 

By adding both series we obtain 

1+2 / 2 s z 5 \ 

(18) L„— -2(« +T + T+ ...) 


(|z| < 1). 


(W < i). 


(kl < i). 


Practical Methods 

The following examples show ways of obtaining Taylor series more quickly than by the 
use of the coefficient formulas. Regardless of the method used, the result will be the same. 
This follows from the uniqueness (see Theorem 1). 


Substitution 

Find the Maclaurin series of /(;) = 1/(1 + - 2 ). 
Solution. By substituting ~z 2 for z in (1 1) we obtain 


1 +z 2 


—37 = 2 (~^ n = 2 (- 1) V" = 1 - j 2 + z 4 - z 6 + • • • (|z| < i). 

' *• 1 n-0 


n«0 


( 19 ) 
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EXAMPLE 6 


EXAMPLE 


EXAMPLE 8 


Integration 

Find the Maclaurin series of f(z) — arctan z. 

Solution . We have /'(z) = 1/(1 + z 2 ). Integrating (19) term by term and using /( 0) = 0 we get 

co r— n n t 3 ? 5 

arctan z = 2 ~ n + x z 2n+1 = z- y + y- + *** (\z\ < 1); 

71=0 “ 

this series represents the principal value of w = u + iv = arctan z defined as that value for which 
|m| < tt/2. M 


Development by Using the Geometric Series 

Develop l/(c — z) in powers of z - Zo> where c — zq^ 0. 

Solution. This was done in the proof of Theorem 1, where c = z*. The beginning was simple algebra and 
then the use of (11) with z replaced by (z - z 0 )/(c - zq ): 


_1 1 

C - Z C — Zq — (z — Zq) 


This series converges for 


( c ~ Zq) 
1 

c-z 0 


1 

1 03 

1 V 1 

f z - Zo \ n 

i 

tv 

1 

C ~Z0 n= 0 

\c - Zq) 

\ C- Zq) 




1 

£ 

z - Zo \ 2 \ 

1 + + 

+ • • • 

\ C - Zo 

\c - Zo / / 


Z- Zq 
c- Z 0 


< I. 


that is, 


|z — Zol < k — Zol- 


Binomial Series, Reduction by Partial Fractions 

Find the Taylor series of the following function with center zo = 1. 

2z 2 + 9z + 5 


Solution. We develop /( z) in partial fractions and the first fraction in a binomial series 


( 20 ) 


1 

(1 + z) n 


= (l + zT 


-2 )z * 

n-0 \ n / 


m(m + 1) , m(m + l)(m + 2) , 

= 1 - mz + — Z — Z + 


2! 


3 ! 


with m = 2 and the second fraction in a geometric series, and then add the two series term by term. This gives 


f(z) = 


1 


1 


2 = I / I \ L_ 

2 - (z - 1) 9 \ [l + |(j — l)] 2 / 1 - i(z - 


(z + 2) 2 z - 3 [3 + (z - l)] 2 

■iic 


8 


31 


23 




108 


275 

1944 


(z ~ if ~ 


We see that the first series converges for |z - l| < 3 and the second for |z - l| < 2. This had to be expected 
because l/(z + 2) 2 is singular at -2 and 2/(z - 3) at 3. and these points have distance 3 and 2, respectively, 
from the center zq = 1 . Hence the whole series converges for \z - 1 1 < 2. ■ 
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TAYLOR AND MACLAURIN SERIES 


Find the Taylor or Maclaurin series of the given function 
with the given point as center and determine the radius of 
convergence. 


1. <r 2 \ 0 


2 . 1/(1 - z 3 ), 0 


3. e z , — 2 i 


4. cos 2 z, 0 


5. sinz, Tr/2 


6 . 1 /z, 1 


7. 1/(1 - z), i 
9. e ~ A2 , 0 

11 . z 6 - z 4 + z 2 - 1 , 


8 . Ln (1 - z), i 

10 . e * 2 f e-‘‘ dt, 0 
•'o 

12 . sinh(z - 2 /), 2 / 


13-16 


HIGHER TRANSCENDENTAL 
FUNCTIONS 


Find the Maclaurin series by tennwise integrating the 
integrand. (The integrals cannot be evaluated by the usual 
methods of calculus. They define the error function erf z, 
sine integral $i(z), and Fresnel integrals 4 S(z) and C(z), 
which occur in statistics, heat conduction, optics, and other 
applications. These are special so-called higher 
transcendental functions.) 


13. erf z 




f" sin t 

14. Si(z) = dt 

J o t 

16. C(z) = f cos t 2 dt 


defines the Bernoulli numbers B n . Using undetermined 
coefficients, show that 


(23) 




B 2 , ♦ B 3 — 0 , 
o 

B 5 = 0, B 6 . 


Write a program for computing B n . 

(c) Tangent. Using (1), (2), Sec. 13.6, and ( 22 ), show 
that tanz has the following Maclaurin series and 
calculate from it a table of B 0> • • • . B 2 o* 


(24) tan z = 


2/ 4/ 

e 2iz - 1 e 4iz - 1 


— 1 


^2n/^2n 1 \ 

- V /_ I V 1- 1 ~ ; d -2n-l 

-2,( n (2;|)! B Zn , . 


18. (Inverse sine) Developing 1 /V 1 — z 2 and integrating, 
show that 


arcsin z = z + 


m)T- 


5 

(M < D- 


Show that this series represents the principal value of 
arcsin z (defined in Team Project 30, Sec. 13.7). 


17. CAS PROJECT, sec, tan, arcsin. (a) Euler numbers. 
The Maclaurin series 


( 21 ) 


sec z = E 0 — 


h. 

2 ! 


z 2 + 



defines the Euler numbers E 2n . Show that E 0 = 1, 
E 2 = -1, £4 = 5, E g = -61. Write a program that 
computes the £ 2 » from the coefficient formula in ( 1 ) 
or extracts them as a list from the series. (For tables 
see Ref. [GR1], p. 810, listed in App. 1.) 

(b) Bernoulli numbers. The Maclaurin series 


( 22 ) 



= 1 + B lZ + ^ Z 3 


h 

21 


19. (Undetermined coefficients) Using the relation 
sin z = tan z cos z and the Maclaurin series of sin z and 
cos z, find the first four nonzero terms of the Maclaurin 
series of tan z. (Show the details.) 

20. TEAM PROJECT. Properties from Maclaurin 
Series. Clearly, from series we can compute function 
values. In this project we show that properties of 
functions can often be discovered from their Taylor or 
Maclaurin series. Using suitable series, prove the 
following. 

(a) The formulas for the derivatives of e z y cos z, sin z, 
cosh z, sinh z, and Ln ( 1 -I- z) 

(b) \{e lz - 1 - e~ lz ) = cos z 

(c) sin z ^ 0 for all pure imaginary z = iy £ 0 


4 AUGUSTIN FRESNEL (1788-1827), French physicist and engineer, known for his work in optics. 
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15.5 Uniform Convergence. Optional 

We know that power series are absolutely convergent (Sec. 15.2, Theorem 1) and, as 
another basic property, we now show that they are uniformly convergent Since uniform 
convergence is of general importance, for instance, in connection with termwise integration 
of series, we shall discuss it quite thoroughly. 

To define uniform convergence, we consider a series whose terms are any complex 
functions f 0 (z ), fi(z\ *** : 


(1) 2 fm(z) = foil) + h(z) + f 2 (z) + ■■■. 

171-0 

(This includes power series as a special case in which f m (z) = a m ( z — 4 o) m .) We assume 
that the series (1) converges for all z in some region G. We call its sum s(z) and its nth 
partial sum .? n (z); thus 


Sn(z) = foil ) + fi(l) + • • ' + f n (z). 

Convergence in G means the following. If we pick a z = Z\ in G, then, by the definition 
of convergence at n* for given e>0we can find an N^e) such that 

k(*i) ~ s n (zi) | < e for all n > N^e). 

If we pick a z 2 in G, keeping € as before, we can find an N 2 (e) such that 

\s(z 2 ) “ s n (z 2 ) | < e for all n > W 2 (e), 

and so on. Hence, given an e > 0, to each z in G there corresponds a number N z (e). 
This number tells us how many terms we need (what s n we need) at a z to make 
kU) “ **n(z)l smaller than e. Thus this number N z (e) measures the speed of 
convergence. 

Small N z (e) means rapid convergence, large N z (e) means slow convergence at the point 
z considered. Now, if we can find an N(e) larger than all these N z (e) for all z in G, we 
say that the convergence of the series (1) in G is uniform. Hence this basic concept is 
defined as follows. 


DEFINITION 


Uniform Convergence 

A series (1) with sum s(z) is called uniformly convergent in a region G if for every 
e > 0 we can find an N = N(e), not depending on z, such that 


\s(z) - s n (z ) | < e for all n > N(e) and all z in G. 


Uniformity of convergence is thus a property that always refers to an infinite set in 
the z-plane, that is, a set consisting of infinitely many points. 
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EXAMPLE 1 Geometric Series 

Show that the geometric series 1 + z + z 2 + • * * is (a) uniformly convergent in any closed disk \z\ ^ r < 1, 
(b) not uniformly convergent in its whole disk of convergence |z| < 1. 

Solution . (a) For z in that closed disk we have |l - z\ ^ 1 — r (sketch it). This implies that 

l/|l — z| = 1/(1 - r). Hence (remember (8) in Sec. 15.4 with q = z) 


\s(z ) - S n (z) I = 


00 


1 

2 z m 




I 7 

m=n+ 1 


• 4 


tt+ 1 


Since r < 1 , we can make the right side as small as we want by choosing n large enough, and since the right 
side does not depend on z (in the closed disk considered), this means that the convergence is uniform. 

(b) For given real K (no matter how large) and n we can always find a z in the disk |z| < 1 such that 


1 - z 


kr . 

\l-z\ > 


simply by taking z close enough to 1. Hence no single Me) will suffice to make |s(z) - ,v n (z)| smaller than a 
given € > 0 throughout the whole disk. By definition, this shows that the convergence of the geometric series 
in \z\ < 1 is not uniform. H 


This example suggests that for a power series , the uniformity of convergence may at most 
be disturbed near the circle of convergence. This is true: 


THEOREM 1 


Uniform Convergence of Power Series 

A power series 

oc 

( 2 ) 2 ~ Zo) m 

m - 0 

with a nonzero radius of convergence R is uniformly convergent in every circular 
disk | z — Zol = r °f radius r < R. 


PROOF For | z — Zo\ = r and any positive integers n and p we have 

(3) \a n+1 (z - Zo) n+1 + • ‘ • + W z - z 0 ) n+p l ^ K+ik n+1 + • * ’ + k+ P k n+p 

Now (2) converges absolutely if \z — Zol = r < R (by Theorem 1 in Sec. 15.2). Hence it 
follows from the Cauchy convergence principle (Sec. 15.1) that, an e > 0 being given, 
we can find an N(e) such that 

|an+ik n+1 + “ “ " + kn+ P k n+p < € for n > N(e) and p = 1, 2, • • • . 
From this and (3) we obtain 


k+lfc - 2o)” +1 + • • • + Cl n+p (z - Z 0 ) n+P \ < € 

for all z in the disk \z — z 0 | = every n > N(e), and every p — 1, 2, • • • . Since N(e) is 
independent of z, this shows uniform convergence, and the theorem is proved. H 

Theorem 1 meets with our immediate need and concern, which is power series. The 
remainder of this section should provide a deeper understanding of the concept of unif orm 
convergence in connection with arbitrary series of variable terms. 
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THEOREM 2 


PROOF 


Properties of Uniformly Convergent Series 

Uniform convergence derives its main importance from two facts: 

1. If a series of continuous terms is uniformly convergent, its sum is also continuous 
(Theorem 2, below). 

2. Under the same assumptions, termwise integration is permissible (Theorem 3). 
This raises two questions: 

1. How can a converging series of continuous terms manage to have a discontinuous 
sum? (Example 2) 

2. How can something go wrong in termwise integration? (Example 3) 

Another natural question is: 

3. What is the relation between absolute convergence and uniform convergence? The 
surprising answer: none. (Example 5) 

These are the ideas we shall discuss. 

If we add finitely many continuous functions, we get a continuous function as their sum. 
Example 2 will show that this is no longer true for an infinite series, even if it converges 
absolutely. However, if it converges uniformly ; this cannot happen, as follows. 


Continuity of the Sum 

Let the series 

CO 

2 = f 0 (z) + fl(z) + * • • 

w=0 

be uniformly convergent in a region G. Let F(z) be its sum. Then if each term f m (z) 
is continuous at a point Zi in G, the function F(z) is continuous at Z\> 


Let s n (z) be the nth partial sum of the series and R n (z) the corresponding remainder: 

/o + /l + * “b fn * & n /n+1 “b fn+ 2 ”b * * * * 

Since the series converges uniformly, for a given € > 0 we can find an N = N(e) such 
that 

\Rn(z)\ < y for all z in G. 

Since s N (z ) is a sum of finitely many functions that are continuous at Z\, this sum is 
continuous at z a . Therefore, we can find a 8 > 0 such that 

|%te) “ %(2i)| < y for all z in G for which | z — Zi| < 5. 

Using F = s N + R n and the triangle inequality (Sec. 13.2), for these z we thus obtain 
|F(z) - F(zj)| = |%(z) + R n (z) - [%(z x ) + /? N (z,)]| 

= I j n(z) - %(2i)| + |Fjv(z)| + |F n (zi)| <~ + y + y = e, 

This implies that F(z) is continuous at z lt and the theorem is proved. ■ 
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EXAMPLE 2 Series of Continuous Terms with a Discontinuous Sum 

Consider the series 

J2 


,.2 


X X 

X* + 5- + 


1 + xf 


(I + A- 2 ) 2 (1 -r X 2 ) 3 


This is a geometric series with q = 1/(1 + x 2 ) times a factor x 2 . Its nth partial sum is 

2 r i i ii 

H L l + A’ 2 (1+A- 2 ) 2 (1 + .V 2 ) n J 


We now use the trick by which one finds the sum of a geometric series, namely, we multiply 
S n (x) by -<?=-!/(!+ A 


1 2 T 1 

5* s n (*) = -a* 

1 + A 2 n L 1 + J 


+ 


1 


1 


(i + .v 2 r (i + x 2 r 


+i • 


(jc real). 


Adding this to the previous formula, simplifying on the left, and canceling most terms on the right, we obtain 

..2 


thus 


a 2 2 r i 

+a- 2 Sn(x) ~ x (i + .vY +i . ■ 


SnW = I + x 


(1 + x 2 ) n ' 


The exciting Fig. 365 “explains” what is going on. We see that if x =£ 0, the sum is 

s(x) = lim s n (x) = 1 + a- 2 . 

)l — »cc 

but for a* = 0 we have j? n (0) = 1 — 1=0 for all n, hence .v(0) = 0. So we have the surprising fact that the 
sum is discontinuous (at x = 0), although all the terms are continuous and the series converges even absolutely 
(its terms are nonnegative, thus equal to their absolute value!). 

Theorem 2 now tells us that the convergence cannot be uniform in an interval containing x - 0. We can also 
verify this directly. Indeed, for x =£ 0 the remainder has the absolute value 

I^n(- T )l “ k(- v ) ~ i n( A ‘)| = ^2yrt 

and we see that for a given e (< 1 ) we cannot find an N depending only on € such that < e for all n > N(e) 
and all a\ say, in the interval 0 ^ a £i I. ■ 



Fig. 365. Partial sums in Example 2 
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EXAMPLE 3 


THEOREM 3 


Termwise Integration 

This is our second topic in connection with uniform convergence, and we begin with an 
example to become aware of the danger of just blindly integrating term-by-term. 

Series for which Termwise Integration is Not Permissible 

Let M m ( x) = mxe and consider the series 

o© 

2 /mW where f m {x) = u^x) - 
m—0 


in the interval 0 ^ .v 1. The nth partial sum is 

S n = «! - «o + H 2 -«! + ••• + «„ - M n _! = tin - «0 = «»• 

Hence the series has the sum F(x) = lim s w (a) = lim u n (x) = 0 (0 = x ^ 1). From this we obtain 

n — »oo ti— 


I' 


F(x) dx = 0. 


On the other hand ¥ by integrating term by term and using f x + / 2 + * • * + f n = s n , we have 

CO f l 

s J 

t J n 


Jn Tl a/n 

m=l 0 m= 1 O 

Now 5 n = n w and the expression on the right becomes 

A 


.1 n -i u 

f m (x) dx = lim 2 I /»W dx = lim s n (x) dx. 

ti—*oQ j n n—*cx> 


lim f u n (x) dx = lim f nxe ^ dx = lim (1 - e n ) = — » 

Ti— *oo Jq n— k» Jq n—foo 2 2 

but not 0. This shows that the series under consideration cannot be integrated term by term from x — 0 to 
x«l. ■ 


The series in Example 3 is not uniformly convergent in the interval of integration, and 
we shall now prove that in the case of a uniformly convergent series of continuous 
functions we may integrate term by term. 


Termwise Integration 

Let 

oo 

F(Z) = 2 /»ft) = /oft) + Aft) + * • • 

m=0 

be a uniformly convergent series of continuous functions in a region G. Let C be 
any path in G. Then the series 

(4) 2 / fmiz) dz = f /o(z) dz+ f a ft) dz + ■ ■ ■ 

m- 0 c C J C 

is convergent and has the sum I F(z) dz . 

J c 
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PROOF From Theorem 2 it follows that F(z ) is continuous. Let s n (z) be the nth partial sum of the 
given series and R n {z) the corresponding remainder. Then F = s n 4 R n and by integration. 


J F(z) dz = J s n (z) dz + j R n (z) dz . 


Let L be the length of C. Since the given series converges uniformly, for every given 
€ > 0 we can find a number N such that |/? n (z)| < dL for all n > N and all z in G. By 
applying the A/L-inequality (Sec. 14.1) we thus obtain 


I L 


R n (z) dz 


< I L = e 


for all n> N. 


Since R n = F — s n , this means that 


| J4(z) dz - f c *n(z) dz 


< e 


for all n > AT. 


J c J c 

Hence, the series (4) converges and has the sum indicated in the theorem. 


Theorems 2 and 3 characterize the two most important properties of uniformly convergent 
series. Also, since differentiation and integration are inverse processes. Theorem 3 implies 


THEOREM 4 


Termwise Differentiation 

Let the series f 0 (z) + f\(z) + f z (z) + • • • be convergent in a region G and let F(z) 
be its sum. Suppose that the series fo(z) 4* f[(z) + f z (z) + * • • converges uniformly 
in G and its tertns are continuous in G. Then 

F\z) = /o(z) + fi(z) + fkz) 4 • • • for all z in G . 


Test for Uniform Convergence 

Uniform convergence is usually proved by the following comparison test. 


THEOREM 5 


Weierstrass 5 M-Test for Uniform Convergence 

Consider a series of the form (1) in a region G of the z-plane . Suppose tha t one can 
find a convergent series of constant terms , 

(5) M 0 4 M x + M 2 4 • - • , 

such that |/ m (z)| = M m for all z in G and every m = 0, 1, * • • . Then (1) is 
uniformly convergent in G. 


5 KARL WEIERSTRASS (1815-1897), great German mathematician, whose lifework was the development 
of complex analysis based on the concept of power series (see the footnote in Sec. 13.4). He also made basic 
contributions to the calculus, the calculus of variations, approximation theory, and differential geometry. He 
obtained the concept of uniform convergence in 1841 (published 1894, sic!): the first publication on the concept 
was by G, G. STOKES (see Sec 10.9) in 1847. 
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The simple proof is left to the student (Team Project 18). 

EXAMPLE 4 Weierstrass M-Test 

Does the following series converge uniformly in the disk \z\ ^ 1? 

« 1 
mL 1 '” 2 + cosh 

Solution . Uniform convergence follows by the Weierstrass M - test and tlie convergence of Sl/w 2 (see 
Sec. 15.1, in the proof of Theorem 8) because 


m + cosh mid 


vr + » 
2 


No Relation Between Absolute and 
Uniform Convergence 

We finally show the surprising fact that there are series that converge absolutely but not 
uniformly, and otheis that converge uniformly but not absolutely, so that there is no 
relation between the two concepts. 

EXAMPLE 5 No Relation Between Absolute and Uniform Convergence 

The series in Example 2 converges absolutely but not uniformly, as we have shown. On the other hand, the series 
oc (— t) m— i 1 1 1 

^ 2, — 2,i “ 2 ,o t • • • (.x real) 

m=1 x + m x +1 x + 2 x + 3 

converges uniformly on the whole real line but not absolutely. 

Proof. By the familkir Leibniz test of calculus (see App. A3.3) the remainder R n does not exceed its first 
term in absolute value, since we have a series of alternating terms whose absolute values form a monotone 
decreasing sequence with limit zero. Hence given e > 0, for all x we have 


KWl S a 1 < 7 < e i 

x f* /? + I n 

This proves uniform convergence, since /V(e) does not depend on jc. 
The convergence is not absolute because for any fixed x we have 


if n > N(e) ^ - . 


where k is a suitable constant, and k2lim diverges. 
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4.2 


Zl «(« + 1 ) 


, \z\ £ 10 s 


5-2 


£ 1 


6 . 2 


h 2 cosh w|z| 


dsi 


7. S |,| s 

n=0 

8. £ , |z| S 10 20 

„ n 
n=l 


9-16 


POWER SERIES 


Find the region of uniform convergence. (Give reason.) 


9 . i (s + 1 - 2<r 

n=0 


4 » 


z, 2 „ n - 


n=l 


CO . 

,3.2^- 

n=l 

ao J2n 

15- ^ 5V 


io. 2 


12.2 


(z ~ O' 
(2«)! 


i2n 


(2z - /)" 


14. 2 (3 re tanh n)z 2n 

n=l 

(-l) n z 2n 


16. 2 


n=0 


( 2 / 0 ! 


17. CAS PROJECT. Graphs of Partial Sums, (a) Figure 
365. Produce this exciting figure using your software 
and adding further curves, say, those of $ 1024 * etc - 
(b) Power series. Study the nonuniformity of 
convergence experimentally by plotting partial sums near 
the endpoints of the convergence interval for real z = x. 

18. TEAM PROJECT. Uniform Convergence. 

(a) Weierstrass M-test. Give a proof. 

(b) Termwise differentiation. Derive Theorem 4 
from Theorem 3. 

(c) Subregions. Prove that uniform convergence of a 
series in a region G implies uniform convergence in 
any portion of G. Is the converse true? 


(d) Example 2. Find the precise region of 
convergence of the series in Example 2 with x replaced 
by a complex variable z . 

(e) Figure 366. Show that x 2 (1 + -v 2 )“ m = l 
if a* ^ 0 and 0 if a* = 0. Verify by computation that the 
partial sums j 1# s 2 * look as shown in Fig. 366. 



Fig. 366. Sum s and partial 
sums in Team Project 18(e) 


19-20 


HEAT EQUATION 


Show that (9) in Sec. 12.5 with coefficients (10) is a solution 
of the heat equation for r > 0, assuming that f(x) is continuous 
on the interval 0 ^ x ^ L and has one-sided derivatives at 
all interior points of that interval. Proceed as follows. 


19. Show that |fl n | is bounded, say \B n \ < K for all n. 
Conclude that 


|h„| < if / S /„ > 0 


and, by the Weierstrass test, the series (9) converges 
uniformly with respect to x and / for t ^ t 0 , 0 ^ x L. 
Using Theorem 2, show that m(.v, t) is continuous for 
t ^ t 0 and thus satisfies the boundary conditions (2) 
for t ^ / 0 . 

20. Show that \du n ldt\ < A n 2 Ke~ x »* to if / ^ / 0 and the 
series of the expressions on the right converges, by the 
ratio test. Conclude from this, the Weierstrass test, and 
Theorem 4 that the series (9) can be differentiated term 
by term with respect to t and the resulting series has 
the sum dufdt. Show that (9) can be differentiated twice 
with respect to a* and the resulting series has the sum 
d 2 w/a* 2 . Conclude from this and the result to Prob. 19 
that (9) is a solution of the heat equation for all 
t ^ r 0 . (The proof that (9) satisfies the given initial 
condition can be found in Ref. [CIO] listed in App. 1.) 






TIONS AND PROBLEMS 


1. What are power series? Why are these series very 
important in complex analysis? 

2. State from memory the ratio test, the root test, and the 
Cauchy-Hadamard formula for the radius of 
convergence. 

3. What is absolute convergence? Conditional convergence? 
Uniform convergence? 


4. What do you know about the convergence of power 
series? 

5. What is a Taylor series? What was the idea of obtaining 
it from Cauchy’s integral formula? 

6. Give examples of practical methods for obtaining 
Taylor series. 

7. What have power series to do with analytic functions? 
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8. Can properties of functions be discovered from their 
Maclaurin series? If so, give examples. 

9. Make a list of Maclaurin series of <? s , cosz, sinz, 
cosh z, sinh z, Ln ( 1 — z) from memory. 

10. What do you know about adding and multiplying power 
series? 


11-20 


RADIUS OF CONVERGENCE 


Find the radius of convergence. Can you identify the sum 
as a familiar function in some of the problems? (Show the 
details of your work.) 


ii S (3z) ” 
£ -■ 

cc 

12. 2 
n®l 

so _2n-i- 1 

so 

14. 2 

?i= s 0 

15. 54ft- 3/ ) 2n 

oc 

16. 2 

n«* 1 

71-0 

17. 5 - 2/) 2 “ 

oo 

18. 2 

n«*>0 

n=0 

go 3 n 

2 w io 

n-1 

GO 

20. 2 
7J.-0 


(— 2) n+1 


(2 it)! 

(“D* 

(2 n + 1)! 
(2z) 2w 


n 


ft - 2 f n+1 


(2 n)! 

ft ~ iT 

(3 + 4/) n 


21-30 


TAYLOR AND MACLAURIN SERIES 


Find the Taylor or Maclaurin series with the given point as 
center and determine the radius of convergence. (Show 
details.) 


21. 777 

23. 1/(1 -z), “1 
25. 1/(1 - z) 3 , 0 

27. 1/z, -/ 

29. cos z, \tt 


22. Ln z, 2 

24. 1/(4 - 3z), 1 + / 

26. 1/z 2 , i 



30. $in 2 z. 


- 1 ) du 
0 


0 


31. Does every function /(z) have a Taylor series? 

32. Does there exist a Taylor series in powers of z - 1 — i 
that diverges at 5 -1- 5 i but converges at 4 + 6/? 

33. Do we obtain an analytic function if we replace x by z 
in the Maclaurin series of a real function /(*)? 

34. Using Maclaurin series, show that if /(z) is even, its 
integral (with a suitable constant of integration) is 
odd. 

35. Obtain the first few terms of the Maclaurin series of 
tan z by using the Cauchy product and 

sin z = cos z tan z. 


V 

‘ * M 




S3 z IS 5 2 IS 5? ] 


Power Series, Taylor Series 


Sequences, series, and convergence tests are discussed in Sec. 15.1. A power series 
is of the form (Sec. 15.2) 

cc 

(1) 2 - z 0 ) n = «o + a i(z - zo) + a 2 (z - Zo) 2 + • • • ; 

n=0 

Zo is its center. The series (1) converges for |z — Zol < R and diverges for 
| z — z 0 | > A, where R is the radius of convergence. Some power series converge 
for all z (then we write R = °°). In exceptional cases a power series may converge 
only at the center; such a series is practically useless. Also, R = lim \a n la n+1 \ if this 
limit exists. The series (1) converges absolutely (Sec. 15.2) and uniformly 
(Sec. 15.5) in every closed disk |z - Zol = r < R (R > 0). It represents an analytic 
function /(z) for |z - z 0 | < R. The derivatives /'(z), ■ • ■ are obtained by 

termwise differentiation of (1), and these series have the same radius of convergence 
R as (1). See Sec. 15.3. 
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Conversely, every analytic function f(z) can be represented by power series. These 
Taylor series of f(z) are of the form (Sec. 15.4) 

(2) f(z) = 2 4 / (n> (2o)U- - Zo) n (I Z -Zo\< R), 

n=0 nl 

as in calculus. They converge for all z in the open disk with center Zo aod radius 
generally equal to the distance from z 0 to the nearest singularity of f(z) (point at 
which f(z) ceases to be analytic as defined in Sec. 15.4). If f(z) is entire (analytic 
for all z; see Sec. 13.5), then (2) converges for all z. The functions e z , cos z, sin z, 
etc. have Maclaurin series, that is, Taylor series with center 0, similar to those in 
calculus (Sec. 15.4). 





CHAPTER 1 6 

Laurent Series. 
Residue Integration 


Laurent series generalize Taylor series. Indeed, whereas a Taylor series has positive integer 
powers (and a constant term) and converges in a disk, a Laurent series (Sec. 16.1) is a 
series of positive and negative integer powers of z — Zo and converges in an annulus (a 
circular ring) with center z 0 . Hence by a Laurent series we can represent a given function 
f(z ) that is analytic in an annulus and may have singularities outside the ring as well as 
in the “hole” of the annulus. 

We know that for a given function the Taylor series with a given center Zq is unique. 
We shall see that, in contrast, a function f(z) can have several Laurent series with the 
same center z 0 and valid in several concentric annuli. The most important of these series 
is that which converges for 0 < \z — Zo\ < R, that is, everywhere near the center z 0 except 
at Zo itself, where Zq is a singular point of /(z). The series (or finite sum) of the negative 
powers of this Laurent series is called the principal part of the singularity of /(z) at z 0 , 
and is used to classify this singularity (Sec. 16.2). The coefficient of the power 1 /(z — Zq) 
of this series is called the residue of /(z) at Zo- Residues are used in an elegant and 
powerful integration method, called residue integration, for complex contour integrals 
(Sec. 16.3) as well as for certain complicated real integrals (Sec. 16.4). 

Prerequisite : Chaps. 13, 14, Sec. 15.2. 

Sections that may be omitted in a shorter course: 16.2, 16.4. 

References and Answers to Problems: App. 1. Part D, App. 2. 


16.1 Laurent Series 

Laurent series generalize Taylor series. If in an application we want to develop a function 
/(z) in powers of z _ Zo when /(z) is singular at zo (as defined in Sec. 15.4), we cannot 
use a Taylor series. Instead we may use a new kind of series, called Laurent series, 1 
consisting of positive integer powers of z — Zo (and a constant) as well as negative integer 
powers of z — Zo; this is the new feature. 

Laurent series are also used for classifying singularities (Sec. 16.2) and in a powerful 
integration method (“residue integration”. Sec. 16.3). 

A Laurent series of /(z) converges in an annulus (in the “hole” of which /(z) may have 
singularities), as follows. 


1 PLERRE ALPHONSE LAURENT (1813-1854), French military engineer and mathematician, published the 
theorem in 1843. 
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CHAP. 16 Laurent Series. Residue Integration 


THEOREM 1 


Laurent's Theorem 

Let f(z) be analytic in a domain containing two concentric circles C a and C 2 with 
center z 0 and the annulus between them (blue in Fig. 367). Then f(z ) can be 
represented by the Laurent series 

f(z) = 2 on(z - zo) n + 2 zr ■>« 

n= 0 n=l Z °' 

(1) = a 0 + a x (z - z 0 ) + a 2 (z - z 0 ) 2 + * * * 

b-t b 2 

. . . ^ 1 1 ± h • • • 

z - z 0 (z - Zo) 2 

consisting of nonnegative and negative powers . The coefficients of this Laurent series 
are given by the integrals 

1 r f(z*) 1 r 

(2) = 2^7 i (T* - ^ *• - 2* * c (z * " ^ /(z * } *•* 

taken counterclockwise around any simple closed path C that lies in the annulus 
and encircles the inner circle , as in Fig. 367. [The variable of integration is denoted 
by z* since z is used in (1).] 

This series converges and represents f(z) in the enlarged open annulus obtained 
from the given annulus by continuously increasing the outer circle C 1 and decreasing 
C 2 until each of the two circles reaches a point where f(z) is singular . 

In the important special case that Zo is the only singular point of f(z) inside C 2j 
this circle can be shrunk to the point Zo> giving convergence in a disk except at the 
center. In this case the series (or finite sum) of the negative powers of( 1) is called 
the principal part of the singularity of f(z) at z 0 . 


C, 



Fig. 367. Laurent’s theorem 

COMMENT. Obviously, instead of (1), (2) we may write (denoting b n by a_ n ) 

00 

(!') f(z) = 2 On(Z- ZoY 1 


n -— oo 
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where all the coefficients are now given by a single integral formula, namely, 

( 2’> - -h i V -Cr*‘ ** 


PROOF We prove Laurent’s theorem, (a) The nonnegative powers are those of a Taylor series. 

To see this, we use Cauchy’s integral formula (3) in Sec. 14.3 with z* (instead of z ) as 
the variable of integration and z instead of Zo- Let g(z) and h(z) denote the functions 
represented by the two terms in (3), Sec. 14.3. Then 


(3) 


1 r f(z*) 1 f 

m = *(z) + Kz) = — f dz* - — f 

2m J c z* ~ z 2m J c 


f(z*) 


dz 


Here z is any point in the given annulus and we integrate counterclockwise over both C x 
and C 2 , so that the minus sign appears since in (3) of Sec. 14.3 the integration over C 2 is 
taken clockwise. We transform each of these two integrals as in Sec. 15.4. The first integral 
is precisely as in Sec. 15.4. Hence we get precisely the same result, namely, the Taylor 
series of g(z ), 


(4) 


g(z) = j> { (4 ) dz* = 2 a n (z - z 0 ) n 
2m z* ~ z 


n= 0 


with coefficients [see (2), Sec. 15.4, counterclockwise integration] 


(5) 


-M 


Kz*) 

(z* - z 0 r +1 


dz*. 


Here we can replace C x by C (see Fig. 367), by the principle of deformation of path, since 
z 0 , the point where the integrand in (5) is not analytic, is not a point of the annulus. This 
proves the formula for the a n in (2). 

(b) The negative powers in (1) and the formula for b n in (2) are obtained if we consider 
h(z) (the second integral times — l/(27n) in (3). Since z lies in the annulus, it lies in the 
exterior of the path C 2 . Hence the situation differs from that for the first integral. The 
essential point is that instead of [see (7*) in Sec. 15.4] 


(6) (a) 



we now have (b) 


z* ~ Zp 

z - z 0 


< 1. 


Consequently, we must develop the expression l/(z* — z) in the integrand of the second 
integral in (3) in powers of (z* ~ Zo)/(z — Zq) (instead of the reciprocal of this) to get a 
convergent series. We find 


1 1 

z* - Z Z* - Zo - (z - z 0 ) 


-1 


(z - Zo) 


(* 
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Compare this for a moment with (7) in Sec. 15.4, to really understand the difference. Then 
go on and apply formula (8), Sec. 15.4, for a fmite geometric sum, obtaining 



Z “ Z* \ Z — Zo / 


Multiplication by —f(z*)l2Tri and integration over C 2 on both sides now yield 


*« - - h t 


f(z*) 


2 m J c 2 z* 


dz* 


1 

2777 


— <fi f(z*) dz * + <j> ( 2 * - Zo)f(z*) dz* + ■ • • 

- J - (z - Zo) J c 2 


L z - zo ■'c: 


+ 7 — — ^ <fi (z* - z 0 ) n V(z*) dz* 

(z - zo)” J Ca 


1 


- \n+ 1 


(z - Zo) 


<f (z* - Zo) n f(z*) dz*\ + K(Z) 
c, i 


with the last term on the right given by 


(7) 


<(z) = 


I cf 

2 m(z - z 0 )" +1 c, 


(z* - Zq )" +1 
z — z* 


f(z*) dz*. 


As before, we can integrate over C instead of C 2 in the integrals on the right. We see that 
on the right, the power l/(z — zo) n is multiplied by b n as given in (2). This establishes 
Laurent’s theorem, provided 


( 8 ) 


lim R*(z) = 0. 

n— > oc 


(c) Convergence proof of (8). Very often ( 1 ) will have only finitely many negative powers. 
Then there is nothing to be proved. Otherwise, we begin by noting that /(z*)/(z — z*) in 
(7) is bounded in absolute value, say. 


/(**) 
z - z* 


< M 


for all z* on C 2 


because f(z*) is analytic in the annulus and on C 2 , and z* lies on C 2 and z outside, so 
that z — z* # 0. From this and the A/L-inequality (Sec. 14.1) applied to (7) we get the 
inequality (L = length of C 2 , \z* — Zq\ = radius of C 2 = const) 


K<z)\ s 


ML 




Z* - Zo 


ltt+1 


2 “ Zo 
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EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 


From (6b) we see that the expression on the right approaches zero as n approaches infinity. 
This proves (8). The representation (1) with coefficients (2) is now established in the given 
annulus. 

(d) Convergence of (1) in the enlarged annulus . The first series in (1) is a Taylor 
series [representing g(z)]; hence it converges in the disk D with center Zo whose radius 
equals the distance of the singularity (or singularities) closest to zo • Also, g(z) must be 
singular at all points outside C 1 where f(z) is singular. 

The second series in (1), representing h(z), is a power series in Z = l/(z — ZoX Let the 
given annulus be r 2 < \z — Zol < r l9 where t\ and r 2 are the radii of C x and C 2 , respectively 
(Fig. 367). This corresponds to l/r 2 > |Z| > 1 h\. Hence this power series in Z must 
converge at least in the disk |Z| < l// 2 . This corresponds to the exterior | z — Zo\ > r 2 of 
C 2 , so that h(z) is analytic for all z outside C 2 . Also, h(z) must be singular inside C 2 where 
f(z ) is singular, and the series of the negative powers of (1) converges for all z in the exterior 
E of the circle with center z 0 and radius equal to the maximum distance from z 0 to the 
singularities of f(z) inside C 2 . The domain common to D and E is the enlarged open annulus 
characterized near the end of Laurent’s theorem, whose proof is now complete. M 

Uniqueness. The Laurent series of a given analytic function f(z) in its annulus of 
convergence is unique (see Team Project 24). However , f(z) may have different Laurent series 
in two annuli with the same center; see the examples below. The uniqueness is essential. As 
for a Taylor series, to obtain the coefficients of Laurent series, we do not generally use the 
integral formulas (2); instead, we use various other methods, some of which we shall illustrate 
in our examples. If a Laurent series has been found by any such process, the uniqueness 
guarantees that it must be the Laurent series of the given function in the given annulus. 


Use of Maclaurin Series 

Find die Laurent series of z" 5 sin z with center 0. 
Solution , By (14), Sec. 15.4, we obtain 



(~l) n 

(2/i + 1)! 


_2n— 4 _ 
<. — 


1_ 

T 



5040 2 


+ - • • • 


(14 > 0 ). 


Here the "annulus” of convergence is the whole complex plane without die origin and die principal part of 
the series at 0 is z~ 4 - g z -2 . HI 


Substitution 

Find the Laurent series of z 2 e l!z with center 0. 

Solution, From (12) in Sec. 15.4 with z replaced by I fz we obtain a Laurent series whose principal part is 
an infinite series. 


= z2 ( l + 7k + i + '') =z2 


+ i+ i 


+ 37T + TL2 + • * * > °)- ■ 


I 

4T: 2 


Development of 1/(1 — z) 

Develop 1/(1 - z) (a) in nonnegative powers of z. 
Solution, 


(b) in negative powers of z. 




72 = 0 


(b) 



-I 

?(1 ~ ~) 



i 


(valid if |z| < I). 


(valid if |z| > 1). ■ 
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EXAMPLE 4 


EXAMPLE 5 


Laurent Expansions in Different Concentric Annuli 

Find all Laurent series of l/(z 3 - z 4 ) with center 0. 
Solution . Multiplying by 1/z 3 , we get from Example 3 
1 


(1) 


(ID 


00 111 
>4 = E^" 3= "3 + J + T + 1+2+ “' 


4 n»0 

1 


z 3 - z 4 


--2 


i i 


n = 0 '■ 


jn + 4 


Use of Partial Fractions 

-2z + 3 

Find all Taylor and Laurent series of f(z ) = — r — ’ T with center 0. 


Solution . In terms of partial fractions, 


- 3s + 2 


1 


1 


/« = -— *-2- 
(a) and (b) in Example 3 take care of the first fraction. For the second fraction, 

1 I * 1 


(c) 


(d) 


Z-2 


z - 2 


= 2 


H«) - 


B) * 


30 2 n 

= “2 3i+T 
o - 


(I) From (a) and (c), valid for \z\ < 1 (see Fig. 368), 

00 / 1 \ 3 5 9 

/« = 2 (l + * * + 4 * + g 

n-0 ' z ' 


Z* + 


(II) From (c) and (b). valid for 1 < |s| < 2, 

Si “ I II 1 , 1 

/te) = 2 ^t+T z ” _ 2 n+1 - 2 + 4 z+ 8 Z+ ’’ a 


(0 < |z| < 1). 

<W > •)• 


n=0 " K=0 

(HI) From (d) and (b), valid for |z| > 2, 


* _ 1 2 3 5 _9_ 

f<x) = ” 2 (2 +D 7 ~ ,2 “ z s A 


(W < 2). 

(I Izl > 2). 


n-0 



Fig. 368. Regions of convergence in Example 5 


If /(z) in Laurent’s theorem is analytic inside C 2 , the coefficients b n in (2) are zero by 
Cauchy’s integral theorem, so that the Laurent series reduces to a Taylor series. Examples 
3(a) and 5(1) illustrate this. 
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1-6 


LAURENT SERIES NEAR A SINGULARITY 
ATO 


Expand the given function in a Laurent series that 
converges for 0 < \z\ < R and determine the precise region 
of convergence. (Show the details of your work.) 



2 . 


z cos 


\_ 

z 



4. 


cosh 2z 


5. 


rV /2 * 



7-14 1 LAURENT SERIES NEAR A SINGULARITY 
AT i 0 

Expand the given function in a Laurent series that 
converges for 0 < \z - Zol < ^ and determine the precise 
region of convergence. (Show details.) 


7. 

9. 

11 . 

12 . 


e 


o Sin - l 

' Zo ~ l 8 ‘ (z - W ’ Z °~ zn 


Z - I 
1 

7 TT ’ 20 

i 

(z + if - (z + /) ’ 
z 3 _ . 

;\2 * Zq - I 


10 . 


cos z 


(z + if ' 

I 


13. 


z ~ 1 


Zq = 1 


14. z 2 sinh — , Zq = 0 


15-23] TAYLOR AND LAURENT SERIES 

Find all Taylor and Laurent series with center z = Zo and 
determine the precise regions of convergence. 


15. 

17. 

19. 


1 


1 - ’ C0 


= 0 


,2 


1 -z 4 
z 3 - 2/z : 
(z - if 


> Zo ~ 0 


4 z ~ 1 

21. r , zo = 0 


23. 


z 4 - 1 
sin z 


z + i'TT 


16. 


1 


1 - zf 


’ <-0 l 


18. - , zo = 1 


sinh z 

Zo = i 20. - 4 . z 0 = 1 


(z - If 
1 

22. — g i Zo — i 


— . Zo = -\ir 


24. TEAM PROJECT. Laurent Series, (a) Uniqueness. 
Prove that the Laurent expansion of a given analytic 
function in a given annulus is unique. 

(b) Accumulation of singularities. Does tan(l/z) 
have a Laurent series that converges in a region 
0 < \z\ < R? (Give a reason.) 

(c) Integrals. Expand the following functions in a 
Laurent series that converges for |^| > 0: 



25. CAS PROJECT. Partial Fractions. Write a program 
for obtaining Laurent series by the use of partial 
fractions. Using the program, verify the calculations in 
Example 5 of the text. Apply the program to two other 
functions of your choice. 


16 u Singularities and Zeros. Infinity 

Roughly, a singular point of an analytic function f(z) is a Zo at which f(z) ceases to be 
analytic, and a zero is a z at which f(z) = 0. Precise definitions follow below. In this 
section we show that Laurent series can be used for classifying singularities and Taylor 
series for discussing zeros. 

Singularities were defined in Sec. 15.4, as we shall now recall and extend. We also 
remember that, by definition, a function is a single-valued relation, as was emphasized 
in Sec. 13.3. 

We say that a function f(z) is singular or has a singularity at a point z = Zo if / (z) is 
not analytic (perhaps not even defined) at z = z 0 , but every neighborhood of z = Zo 
contains points at which f(z ) is analytic. We also say that z = Zo is a singular point of /(z). 

We call z = z 0 an isolated singularity of f(z ) if z = z 0 ha s a neighborhood without 
further singularities of /(z). Example: tan z has isolated singularities at ±7t/2, ±37t/2, etc.; 
tan (l/z) has a nonisolated singularity at 0. (Explain!) 
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EXAMPLE 1 


EXAMPLE 2 


Isolated singularities of f(z) at z = z 0 can be classified by the Laurent series 
(1) f(z) = 2 a n (z - Zo) n + 2 , * n , w (Sec. 16.1) 

valid in the immediate neighborhood of the singular point z = z 0 > except at z 0 itself, that 
is, in a region of the form 


0 < \z - z 0 | < R ■ 

The sum of the first series is analytic at z — z<>» as we ^now from the last section. The 
second series, containing the negative powers, is called the principal part of (1), as we 
remember from the last section. If it has only finitely many terms, it is of the form 


( 2 ) 



-I- 


bm 

(z - z 0 ) m 


(b m * 0). 


Then the singularity of f(z) at z = z 0 is called a pole, and m is called its order, Poles of 
the first order are also known as simple poles. 

If the principal part of (1) has infinitely many terms, we say that f(z ) has at z = z 0 an 
isolated essential singularity. 

We leave aside nonisolated singularities. 


Poles. Essential Singularities 

The function 


f(z) 


1 3 

z(z - 2) 5 + (z - 2) 2 


has a simple pole at <: = 0 and a pole of fifth order at z = 2. Examples of functions having an isolated essential 
singularity at z = 0 are 


00 1 11 
n-0 «■ 2! Z 


and 


(-If 


Sm z n ? 0 (2/i + 1 )!z 2w+1 


3 !z 3 


5U 5 


Section 16.1 provides further examples. For instance, Example 1 shows that z -5 sin z has a fourth-order pole 
at 0. Example 4 shows that l/(z 3 - z 4 ) has a third-order pole at 0 and a Laurent series with infinitely many 
negative powers. This is no contradiction, since this series is valid for \z\ > 1; it merely tells us that in classifying 
singularities it is quite important to consider the Laurent series valid in the immediate neighborhood of a singular 
point. In Example 4 Uiis is the series (I), which has three negative powers. H 


The classification of singularities into poles and essential singularities is not merely a 
formal matter, because the behavior of an analytic function in a neighborhood of an 
essential singularity is entirely different from that in the neighborhood of a pole. 


Behavior Near a Pole 

f(z) — Uz has a pole at z = 0, and |/(z)| — > oc as z — *■ 0 in any manner. This illustrates the following 
theorem. am 
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THEOREM 1 


EXAMPLE 3 


THEOREM 2 


EXAMPLE 4 


Poles 

Iff(z) is analytic and has a pole atz = Zq, then |/(z)| — » °o as z — » Zq in any manner . 


The proof is left to the student (see Prob. 12). 


Behavior Near an Essential Singularity 

The function f(z) = e has an essential singularity at z = 0. It has no limit for approach along the imaginary 
axis; it becomes infinite if z — ^ 0 through positive real values, but it approaches zero if z — > 0 through negative 
real values. It takes on any given value c = c 0 e ta 0 in an arbitrarily small e-neighborhood of z = 0. To see 
the letter, we set z = re ie , and then obtain the following complex equation for r and 0, which we must solve: 

Afz _ _(cos 0 — i sin 0)Jr _ „ Act 
ef c — Cqc . 

Equating the absolute values and the arguments, we have e (cos = c 0 , that is 


cos 0 = r In Cg« and -sin 9 = orr 

respectively. From these two equations and cos 2 6 + sin 2 9 = r 2 (ln c 0 ) 2 + a 2 r 2 = I we obtain the formulas 


1 


(In c 0 f -1- a 2 


and 


tan 6 = - 


In c 0 


Hence r can be made arbitrarily small by adding multiples of 27r to a, leaving c unaltered. This illustrates the 
very famous Picard’s theorem (with z = 0 as the exceptional value). For the rather complicated proof, see Ref. 
[D4], vol. 2, p. 258. For Picard, see Sec. 1.7. ■ 


Picard's Theorem 

If f(z) is analytic and has an isolated essential singularity at a point Zo> it takes on 
every value, with at most one exceptional value, in an arbitrarily small e-neighborhood 
ofzo- 


Removable Singularities. We say that a function f(z) has a removable singularity at 
z — Zo if fiz) is not analytic at z = Zo» but can be made analytic there by assigning a 
suitable value f(zo)- Such singularities are of no interest since they can be removed as 
just indicated. Example: f(z ) = (sin z)/z becomes analytic at z = 0 if we define /( 0) = 1 . 


Zeros of Analytic Functions 

A zero of an analytic function f(z) in a domain D is a z = z 0 in D such that f(zo) = 0. 
A zero has order n if not only / but also the derivatives • • • , / (w “ 1) are all 0 at 

z — Zo but f n \z 0 ) =£ 0. A first-order zero is also called a simple zero. For a second-order 
zero, /(z 0 ) = f f (z 0 ) = 0 but f "(zo) * 0. And so on. 

Zeros 

The function I + z 2 has simple zeros at ±i. The function (1 - z 4 ) 2 has second-order zeros at ±1 and ±i. The 
function ( z - af has a third-order zero at ’ = a. The function ^ has no zeros (see Sec. 13.5). The function 
sin z has simple zeros at 0, ±7 r, ±27 r, • • ■ , and sin 2 z has second-order zeros at these points. The function 
1 - cos z has second-order zeros at 0, ±2 tt, ±4t r, • • • , and the function (I - cos zf has fourth-order zeros 
at these points. h 
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THEOREM 3 


PROOF 


THEOREM 4 


Taylor Series at a Zero, At an mh-order zero z — Zo of f(z), the derivatives f'(zo\ • • • , 
f n ~ X \zo) are zero, by definition. Hence the first few coefficients a 0 , • • • , a n of the 
Taylor series (1), Sec. 15.4, are zero, too, whereas a n ^ 0, so that this series takes the 
form 

f(z) = a n (z - Zo) n + a n+1 (z - Zo) n+1 + • • • 

(3) 

= (z - Zo) n [o n + a n+1 (z - Zo) + a n+2 (z - z 0 ) + • • •] (a n # 0). 

This is characteristic of such a zero, because if f{z) has such a Taylor series, it has an 
nth-order zero at z = Zo> as follows by differentiation. 

Whereas nonisolated singularities may occur, for zeros we have 


Zeros 

The zeros of an analytic function f(z) 0) are isolated; that is , each of them has 

a neighborhood that contains no further zeros of /(z). 


The factor ( z - z 0 ) n in (3) is zero only at z = z 0 . The power series in the brackets 
[• • •] represents an analytic function (by Theorem 5 in Sec. 15.3), call it g{z). Now 
g(z 0 ) = a n =£ 0, since an analytic function is continuous, and because of this continuity, 
also g(z ) # 0 in some neighborhood of z — Zq. Hence the same holds of f{z). ■ 

This theorem is illustrated by the functions in Example 4. 

Poles are often caused by zeros in the denominator. {Example: tan z has poles where 
cos z is zero.) This is a major reason for the importance of zeros. The key to the connection 
is the following theorem, whose proof follows from (3) (see Team Project 24). 


Poles and Zeros 

Let f{z) be analytic at z = Zo and have a zero of nth order at z = Zq. Then \/f(z) 
has a pole of nth order at z = z 0 ; cind so does h(z)/f{z ), provided h{z) is analytic 
at z = Zo and h(z 0 ) # 0. 


Riemann Sphere. Point at Infinity 

When we want to study complex functions for large |z|, the complex plane will generally 
become rather inconvenient. Then it may be better to use a representation of complex 
numbers on the so-called Riemann sphere. This is a sphere 5 of diameter 1 touching the 
complex z-plane at z = 0 (Fig. 369), and we let the image of a point P (a number z in the 
plane) be the intersection P* of the segment PN with S, where N is the “North Pole” 
diametrically opposite to the origin in the plane. Then to each z there corresponds a point 
on S. 

Conversely, each point on S represents a complex number z, except for N, which does 
not correspond to any point in the complex plane. This suggests that we introduce an 
additional point, called the point at infinity and denoted <* (“infinity”) and let its image 
be N. The complex plane together with w is called the extended complex plane. The 
complex plane is often called the finite complex plane, for distinction, or simply the 
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N 



complex plane as before. The sphere S is called the Riemann sphere* The mapping of 
the extended complex plane onto the sphere is known as a stereographic projection. 
(What is the image of the Northern Hemisphere? Of the Western Hemisphere? Of a straight 
line through the origin?) 

Analytic or Singular at Infinity 

If we want to investigate a function f(z) for large |z|, we may now set z = 1 /w and investigate 
f(z) = /( 1/w) = g(w) in a neighborhood of w = 0. We define f(z) to be analytic or singular 
at infinity if g(w) is analytic or singular, respectively, at w = 0. We also define 

(4) g( 0) = lim g(w) 

u>-* 0 

if this limit exists. 

Furthermore, we say that f(z) has an nth-order zero at infinity if /(1/w) has such a zero 
at w = 0. Similarly for poles and essential singularities. 

EXAMPLE 5 Functions Analytic or Singular at Infinity. Entire and Meromorphic Functions 

The function f(z) = I Jz 2 is analytic at oc since g(w) = /(1/w) = w 2 is analytic at w = 0, and f(z) has a second- 
order zero at oc. The function f(z) = z 3 is singular at oo and has a third-order pole there since the function 
g(w) = /(1/w) = 1/w 3 has such a pole at w = 0. The function e z has an essential singularity at oc since e llw 
has such a singularity at w = 0. Similarly, cos z and sin z have an essential singularity at oo. 

Recall that an entire function is one that is analytic everywhere in the (finite) complex plane. Liouville’s 
theorem (Sec. 14.4) tells us that the only bounded entire functions are the constants, hence any nonconstant 
entire function must be unbounded. Hence it has a singularity at oc, a pole if it is a polynomial or an essential 
singularity if it is not. The functions just considered are typical in this respect. 

An analytic function whose only singularities in the finite plane are poles is called a meromorphic function. 
Examples are rational functions with nonconstant denominator, tan z, cot z, sec z, and esc z. B 


In this section we used Laurent series for investigating singularities. In the next section 
we shall use these series for an elegant integration method. 


|l-Hl| SINGULARITIES 

Determine die location and kind of the singularities of the 
following functions in the finite plane and at infinity. In the 
case of poles also state die order. 


1 . tan 2 777 

3. cot £ 2 
5. cos z — sin z 



4 Z 3 e ii(z-n 

6. l/(cos z — sin z) 
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sin 3 z 
7 ' (z 4 - 

4 2 8 

8 - z - 1 + (z - l) 2 (z - l) 3 
9. cosh [l/(z 2 + 1)] 10. e XKz - x 't{e z - 1) 

11. (Essential singularity) Discuss e llz2 in a similar way 
as e 1,z is discussed in Example 3. 

12. (Poles) Verify Theorem 1 for f(z) = z~ 3 - z~ l - Prove 
Theorem 1. 

13-22 1 ZEROS 

Determine the location and order of the zeros. 

13. (z + 16i) 4 14. (z 4 - 16) 4 

15. z -3 sin 3 17 z 16. cosh 2 z 

17. (3z 2 + l)e~ l 18. (z 2 - 1 )V* - 1) 

19. (z 2 + 4)(e z - l) 2 20. (sin z - l) 3 


21. (1 — cosz) 2 22. e* - e 2z 

23. (Zeros) If f(z) is analytic and has a zero of order n at 
z = Zo * show that f 2 (z) has a zero of order 2 n. 

24. TEAM PROJECT. Zeros, (a) Derivative. Show that 
if f(z) has a zero of order n > 1 at z = z 0 , then f\z ) 
has a zero of order n — 1 at z 0 - 

(b) Poles and zeros. Prove Theorem 4. 

(c) Isolated /e-points. Show that the points at which 
a nonconstant analytic function f(z) has a given value 
Ic are isolated. 

(d) Identical functions. If f x (z) are analytic in a 
domain D and equal at a sequence of points z n in D 
that converges in Z>, show that t\(z) = / 2 (z) in D. 

25. (Riemann sphere) Assuming that we let the image of 
the x-axis be meridians 0° and 180°, describe and 
sketch (or graph) the images of the following regions 
on the Riemann sphere: (a) |z| > 100, (b) the lower 
half-plane, (c) \ ^ |z| = 2. 


16 .: Residue Integration Method 

The purpose of Cauchy’s residue integration method is the evaluation of integrals 

<f> f(z)dz 


taken around a simple close path C. The idea is as follows. 

If f(z) is analytic everywhere on C and inside C, such an integral is zero by Cauchy’s 
integral theorem (Sec. 14.2), and we are done. 

If f(z ) has a singularity at a point z = z 0 inside C, but is otherwise analytic on C and 
inside C, then f(z ) has a Laurent series 


/(z) = 2 a n(z - Zo)" + 


Z ~ Z 0 


(z - Zo) 2 


that converges for all points near z = Zq (except at z = Zq itself), in some domain of the 
form 0 < \z — Zo\ < R (sometimes called a deleted neighborhood, an old-fashioned term 
that we shall not use). Now comes the key idea. The coefficient hi of the first negative 
power M{z — Zo) of this Laurent series is given by the integral formula (2) in Sec. 16.1 
with n = 1, namely. 



Now, since we can obtain Laurent series by various methods, without using the integral 
formulas for the coefficients (see the examples in Sec. 16.1), we can find b x by one of 
those methods and then use the formula for bi for evaluating the integral, that is, 

<£ f(z) dz = 2mb x . 

J c 


( 1 ) 
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EXAMPLE 1 


EXAMPLE 2 


Here we integrate conunterclockwise around a simple closed path C that contains z = Zo 
in its interior (but no other singular points of f(z) on or inside C!). 

The coefficient is called the residue of f(z) at z = z 0 and we denote it by 

(2) fci = Res f(zl 

z=z 0 


Evaluation of an Integral by Means of a Residue 

Integrate die function /(z) = z*~ 4 sin z counterclockwise around the unit circle C. 
Solution . From (14) in Sec. 15.4 we obtain the Laurent series 


smz 
Hz) = — 
z 


1 Z z_ 

3!z 5! 7! + 


which converges for |z| > 0 (that is, for all z # 0). This series shows that f(z) has a pole of third order at z = 0 
and the residue b 1 = —1/3!. From (1) we thus obtain the answer 


£ 

'c z 4 


sin z 
— dz = 2'iribi 


f rri 

T ' 


CAUTION! Use the Right Laurent Series! 

Integrate f(z ) = l/(z s - z 4 ) clockwise around the circle C: |z| = 1/2. 

Solution . z 3 — z 4 = z 3 (l — z) shows Uiat /(z) is singular at z = 0 and z = 1- Now z = 1 lies outside C. 
Hence it is of no interest here. So we need the residue of /(z) at 0. We find it from the Laurent series that 
converges for 0 < |z| < 1. This is series (I) in Example 4, Sec. 16.1, 

1 1 1 1 

“3 4 = T + T + “ + 1 + ^ ‘ (0 < |z| < 1). 

z — z z z - 

We see from it that this residue is 1 . Clockwise integration thus yields 


i 


dz 


3 4 = —27ri Res f(z) = -27 tL 

C Z ~ Z 2“0 

CAUTION! Had we used the wrong series (II) in Example 4, Sec. 16.1, 


I 


1 


1 


z 3 - z 4 


(|z| > 1), 


we would have obtained the wrong answer, 0, because this series has no power 1/z. 


Formulas for Residues 

To calculate a residue at a pole, we need not produce a whole Laurent series, but, more 
economically, we can derive formulas for residues once and for all. 

Simple Poles. Two formulas for the residue of f(z) at a simple pole at zo are 
(3) Res f(z) = b\ = lim ft - z 0 )f(z) 

Z — Zq Z — +Zg 


and, assuming that f(z) = p(z)/q(z), p(z 0 ) # 0, and q(z) has a simple zero at Zo (so that 
f(z) has at Zq a simple pole, by Theorem 4 in Sec. 16.2), 


( 4 ) 


Res f(z) = Res ^77 

z=z„ z=z 0 q(z) 


Pfe 0) 
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PROOF For a simple pole at z = Zq the Laurent series (1), Sec. 16.1, is 
f(z) = _ ~ + a Q + c>i(z ~ Zo ) + a 2 (z - z 0 ) 2 + • • • 

<. \>Q 


(0 < \z - s 0 | < R)- 


Here b x i* 0. (Why?) Multiplying both sides by z — Zo and then letting z— » Zo, we obtain 
the formula (3): 

lim (z - z 0 )f(z) = b x + lim (z - z 0 )[a 0 + «i(z - z 0 ) + • • •] = b x 

z-^z o Z—*Zo 

where the last equality follows from continuity (Theorem 1, Sec. 15.3). 

We prove (4). The Taylor series of q(z) at a simple zero z<> is 

q{z) = (z - Zo)q(z 0 ) + — ^ ~ q\z Q ) + * • * . 

Substituting this into / = plq and then f into (3) gives 

n rs \ , PW (Z ~ Z 0 )P(Z) 

Res /(z) = lim (z - z 0 ) -77 = 1,m 7~, 7, ; • 

* _;:o 2 ^o q(z) z-zo (r - Zo )[q ( Zo ) + ( z - z 0 )q (z 0 )/2 + • • •] 

z — Zo cancels. By continuity, the limit of the denominator is q\zo) and (4) follows. ■ 

EXAMPLE 3 Residue at a Simple Pole 

f(z) - (9s -1- i)/(z 3 + z) has a simple pole at / because z z + 1 = (z + i)(z - /), and (3) gives the residue 

9 z + i 9z + i r * + / I 10/ 

z=i z(Z Z + 1) z(z + l){z - i) L zb + 0 J z=i -2 

By (4) with /;(/) = 9/ + / and q'(z) = 3z 2 + I we confirm the result. 

9z + / r 9z + / 1 10/ — 

Res — 5 = — 5 = — — = —5/. ■ 

==is(s 2 +l) L 3t 2 + 1 J 2=i -2 

Poles of Any Order. The residue of /(") at an mth-order pole at z 0 is 

1 


(5) 


Res f(z) = — lim 

*=*<> (m — I)! *—‘o 


r rf ” 1 - 1 r 11 


In particular, for a second-order pole (m = 2), 
(5*) 


Res /(z) = lim {[(z - Zofm}'} . 

z=z 0 z-+z 0 


PROOF The Laurent series of /(z) converging near z 0 (except at z 0 itself) is (Sec. 16.2) 


f(z) = 


bn 


bm- 1 


( z - Zo) m (S - Jo )" 1 - 1 


+ a 0 + «i(z — s 0 ) + 


where b m # 0. The residue wanted is b v Multiplying both sides by (z - z 0 ) m gives 
(z ~ zo) m f(z) = b m + fe TO _j(s - s 0 ) + • • • + b t (z - zo) m ~ l + « 0 (s - z 0 ) m + • ■ • , 
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EXAMPLE 4 


THEOREM 1 


We see that is now the coefficient of the power ( z — z 0 ) w 1 of the power series of 
g(z) = (z - z 0 ) m f(z). Hence Taylor’s theorem (Sec. 15.4) gives (5): 




1 

(m — 1)! 


/ m_1) (Zq) 


1 

(m - l)! 


d m ~ 1 

^rrr [(z - Zo)"7(z)]. 


Residue at a Pole of Higher Order 

/(z) = 50 z/(z 3 + 2z 2 - 7z + 4) has a pole of second order at z = 1 because the denominator equals 
(z + 4)(z — l) 2 (verify!). From (5*) we obtain the residue 


Res f(z) = lira -f [(z - l) 2 /(z)] 

2 = 1 Z — * 1 UZ 



Several Singularities Inside the Contour. 

Residue Theorem 

Residue integration can be extended from the case of a single singularity to the case of 
several singularities within the contour C. This is the purpose of the residue theorem. The 
extension is surprisingly simple. 


Residue Theorem 

Let f(z) be analytic inside a simple closed path C and on C, except for finitely many 
singular points z±, z. 2 , * • * ,Zk inside C. Then the integral of f(z) taken counterclockwise 
around C equals 2m times the sum of the residues off(z) at • * * , Zi c - 


( 6 ) 




Fig. 370. Residue theorem 
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PROOF 


EXAMPLE 5 


EXAMPLE 6 


We enclose each of the singular points Zj in a circle Cj with radius small enough that those 
k circles and C are all separated (Fig. 370). Then f(z ) is analytic in the multiply connected 
domain D bounded by C and C lf • • • , C k and on the entire boundary of £>. From Cauchy’s 
integral theorem we thus have 

(7) j>f(z) dz + j> f(z ) dz + f f(z) dz + ■■■ +j> f(z) dz = 0, 

C Ci C2 Cjt 

the integral along C being taken counterclockwise and the other integrals clockwise (as in 
Figs. 351 and 352, Sec. 14.2). We take the integrals over C l9 • • • , C k to the right and 
compensate the resulting minus sign by reversing the sense of integration. Thus, 


( 8 ) 


<j> f(z) dz = <f f(z) dz + j> f(z) dz 



where all the integrals are now taken counterclockwise. By (1) and (2), 


P f(z) dz = 2m Res /(z), 

Cj z=z j 


so that (8) gives (6) and the residue theorem is proved. 


j = • # ‘ t k y 


This important theorem has various applications in connection with complex and real 
integrals. Let us first consider some complex integrals. (Real integrals follow in the next 
section.) 


Integration by the Residue Theorem. Several Contours 

Evaluate the following integral counterclockwise around any simple closed path such that (a) 0 and 1 are inside 
C, (b) 0 is inside, 1 outside, (c) I is inside, 0 outside, (d) 0 and 1 are outside. 


c£ izi* 

Jc z 2 -z 


dz 


Solution . The integrand has simple poles at 0 and 1, with residues [by (3)] 

4-3 z f 4 - 3z “| ^ 4 - 3z f 4 - 3z “| 

Res — — = — = -4, Res — — = = l. 

2=0 z(z ~ 1 ) L z - 1 J 2 =o *=i z(z- I) L z J 2=1 

[Confirm this by (4).J Ans. (a) 27 t/(-4 + 1) = - 67 ri, (b) -87 t/, (c) 2m, (d) 0. 


Another Application of the Residue Theorem 

Integrate (tanz)/(z 2 - 1) counterclockwise around the circle C: \z\ = 3/2. 

Solution . tan z is not analytic at ± 7 t/ 2, ±3tt/2, • • • T but all these points lie outside the contour C. Because 
of the denominator z 2 - 1 = (z - l)(z + 1) the given function has simple poles at ±1. We thus obtain from 
(4) and the residue theorem 



= 2m tan 1 = 9.7855/. 
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EXAMPLE 7 Poles and Essential Singularities 


Evaluate the following integral, where C is the ellipse 9a 2 + y 2 = 9 (counterclockwise, sketch it). 




Solution . Since z 4 —16 = 0 at ±2/ and ±2, the first term of the integrand has simple poles at ±2/ inside 
C, with residues [by (4); note that e 2<1ti = l] 


f ze™ 1 _ _ J|_ 

z 4 - 16 L 4z 3 J 2=2 1 16 

ze™ _ I" ze nz "1 _ 1 

z«-2i z 4 — 16 L 4z 3 J z=-2i 16 


and simple poles at ±2, which lie outside C, so that they are of no interest here. The second term of the integrand 
has an essential singularity at 0, with residue w 2 / 2 as obtained from 


'-('♦7 


tt 2 

+ 2\z 2 + 3!z 3 


: Z + TT + — H 

2 z 


A/w. 2 tt/(-^ — + 2 71 ' 2 ) = , 7r('7r 2 ~ J)i = 30.221/ by the residue theorem. 


(Id > 0). 


zEH3aEB3EEM“£E3EE33E3^= 


1. Verify the calculations in Example 3 and find the other 
residues. 

2. Verify the calculations in Example 4 and find the other 
residue. 

3-12 1 RESIDUES 

Find all the singular points and the corresponding residues. 
(Show the details of your work.) 

1 .cos z 


z 2 + 1 


7. cotz 


fe 2 “ I) 2 


11. tanz 


V. o 

z - z 
8. sec z 
1/3 


z 4 - 1 


13. CAS PROJECT. Residue at a Pole. Write a program 
for calculating the residue at a pole of any order. Use 
it for solving Probs. 3-8. 

14-25 1 RESIDUE INTEGRATION 

Evaluate (counterclockwise). (Show the details.) 




dz, C:\z-il = 2 


15. <f e llz dz, C: |z| = 1 

J c 

17. <p tan 7 tz dzy C: \z\ — 1 
J c 

18. Y tan 7 tz diy C: |z| — 2 
J c 

19. — *. C: \z\ = 4.5 
J C COS z 

20. y coth z dzy C: \z\ = 1 
J c 

21. 4 * dz, C: | z ~ »| = 1- 

Jq COS 7TZ 

22. f c z 2 - 3 Tz dz ’ C: ^ = 1 

23. <£ tan 3 7rZ dz, C; |z + |/| = 1 

24 £ (Z 2 + iX2 -2) * C ' W 


dz, C: |z - »| = 1.5 


z z - 3('z 


7 - dz. C; Izl = 1 


dz, C: Iz + |/| = 1 


J c z 

£ (z 2 + l)(2 - z) C ' ,Z| = 1 

X 30 * 2 “ 23z + 5 

2S - f i » *, C: z =l 


I 3 0z 2 - 23z + 5 

'c (2z - 1) 2 (3 z - 1) 
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16.4 Residue Integration of Real Integrals 

It is quite surprising that certain classes of complicated real integrals can be integrated 
by the residue theorem, as we shall see. 


Integrals of Rational Functions of cos 0 and sin 0 

We first consider integrals of the type 


(1) 


2* 

J = I F( cos 0, sin 0) d6 

■'A 


where F(cos 0, sin 0) is a real rational function of cos 0 and sin 0 [for example, 
(sin 2 0)/( 5 — 4 cos 0)] and is finite (does not become infinite) on the interval of integration. 
Setting e t0 = z , we obtain 


( 2 ) 


cos e = -J (e ie + e~ i0 ) = J 

sin 0 = (e*° - e~ i6 ) = 

2 / 2 / 



Since F is rational in cos 0 and sin 0, Eq. (2) shows that F is now a rational function of 
z, say, f(z). Since dz/dO = ie l °, we have dO - cfe/r'z and the given integral takes the form 

(3) J = 4 /(z) 4 1 

J c »z 

and, as 0 ranges from 0 to 2 7r in (1), the variable z = e 10 ranges counterclockwise once 
around the unit circle \z\ = 1. (Review Sec. 13.5 if necessary.) 

EXAMPLE 1 An Integral of the Type (1) 

r 2 " de 

Show by the present method that —7= = 2tt. 

J q V2 - cos 0 

Solution . We use cos 0 = 5(2 + 1/;) and dO — dz/iz. Then the integral becomes 

f— 

c -j(z z -2V2z+ I) 

_ 1 <£ * 

' Jc (z - V 2 - l)(z - V5 + l) * 



We see tliat the integrand has a simple pole at n = V 2 + 1 outside the unit circle C. so that it is of no interest 
here, and another simple pole at - 2 = v2 - 1 (where z - V2 +1=0) inside C with residue [by (3), Sec. 16.3] 


*“** (z - V2 - l)(z - V2 + 1) L z “ V2 - 1 J 2 =v^-i 

~ 2 * 

Answer: 2m(—2/i){— 1/2) = 2tt. (Here —2/i is the factor in front of the last integral.) ■ 
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As another large class, let us consider real integrals of the form 

(4) J Kx) dx. 

Such an integral, whose interval of integration is not finite is called an improper integral, 
and it has the meaning 


<5 # ) 


f /(a) dx = lim f f(x) dx + lim f f(x) dx. 

J -oc a-+—oc J n 


If both limits exist, we may couple the two independent passages to — and and write 


(5) 


f f(x) dx = lim f f(x) dx. 

•'-M R— ► GO 


The limit in (5) is called the Cauchy principal value of the integral. It is written 

pr. v. I f{x) dx. 

— 30 

It may exist even if the limits in (5*) do not. Example: 

R 2 R 2 \ f b 

— —1=0, but lim a* dx = °c. 

2 2 / 

We assume that the function /(a) in (4) is a real rational function whose denominator 
is different from zero for all real a and is of degree at least two units higher than the 
degree of the numerator. Then the limits in (5') exist, and we may start from (5). We 
consider the corresponding contour integral 



(5 s55 ) 



around a path C in Fig. 371. Since /(a) is rational, f(z ) has finitely many poles in the 
upper half-plane, and if we choose R large enough, then C encloses all these poles. By 
the residue theorem we then obtain 


f f(z ) dz = J f(z) dz. + J Kx) dx = 2m 2 Res /(z) 

C S —R 
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EXAMPLE 2 


where the sum consists of all the residues of f(z) at the points in the upper half-plane at 
which f(z) has a pole. From this we have 


( 6 ) 


f /(*) dx = 2m 2) Res f(z) - f f(z ) dz . 
-r J ° 


We prove that, if R <», the value of the integral over the semicircle S approaches 
zero. If we set z = Re 10 , then S is represented by R = co/urf, and as z ranges along 5, the 
variable 0 ranges from 0 to tt. Since, by assumption, the degree of the denominator of 
f(z) is at least two units higher than the degree of the numerator, we have 



(kl = R>Ro) 


for sufficiently large constants k and R 0 . By the ML-inequality in Sec. 14.1, 



k kir 

< 1? ,,r= T 


<* > »«>■ 


Hence, as R approaches infinity, the value of the integral over S approaches zero, and (5) 
and (6) yield the result 


(7) 


f f(x) dx = 2 m 2 Res f(z) 

J -oo 


where we sum over all the residues of f(z) at the poles of f(z ) in the upper half-plane. 

An Improper Integral from 0 to <» 

Using (7). show that 

r dx 

J 0 Ita- 4 ~ 2V2 ' 



Fig. 372. Example 2 


Solution. Indeed, f(z ) = 1/(1 + z 4 ) has four simple poles at the points (make a sketch) 


zi = 


= 3rri/4 
-2 e » 


= e- 3:ri/4 


Z4 = e-™' 4 . 


The first two of these poles lie in the upper half-plane (Fig. 372). From (4) in the last section we find the residues 



SEC 16.4 Residue Integration of Real Integrals 


721 


£s ,<a -hrrsd-«-[iL 


-3wi/4 irtf4 


4 

_L — 9iri/4 _ _j_ —TriJ& 
4 e 4 


(Here we used e m = — 1 and e 2in = 1.) By (1) in Sec. 13.6 and (7) in this section, 

00 

f dx 2 m ... _ 2m ir tt tt 

4 = - — (e ml * - e ml *) = - — • 2 < • sin — = irsn — = 

J- x I + x 4 4 4 4 4 V2 


Since 1/(1 + a 4 ) is an even function, we thus obtain, as asserted, 

dx 


f -^--1 f 

J 0 I + -V 4 2 J_ 3 


I +A' 4 2V2 ‘ 


Fourier Integrals 

The method of evaluating (4) by creating a closed contour (Fig. 371) and “blowing it up” 
extends to integrals 


( 8 ) 



cos sx dx 


and 


I f(x) sin sx dx 

— oc 


(s real) 


as they occur in connection with the Fourier integral (Sec. 1 1 .7). 

If f(x) is a rational function satisfying the assumption on the degree as for (4), we may 
consider the corresponding integral 


T f($ e1sz dz (s real and positive) 

J c 

over the contour C in Fig. 371 on p. 719. Instead of (7) we now get 

(9) /(*)«*“ dx = 2m 2 Res [f(z)e isz ) (s > 0) 

-00 

where we sum the residues of fizle™* at its poles in the upper half-plane. Equating the 
real and the imaginary parts on both sides of (9), we have 


J f(x) cos sx dx = —27 r 2 Im Res [f{z)e tsz ] i 

(10) (s > 0) 

r°° 

I f(x) sin sx dx = 27 r 2 Re Res [f(z)e lsz ]. 

•'—M 


To establish (9), we must show [as for (4)] that the value of the integral over the 
semicircle S in Fig. 371 approaches 0 as R — » <». Now s > 0 and S lies in the upper 
half-plane y ^ 0. Hence 

= \e* x+ W\ = |^| =1.^SI (s > 0, y^O). 

From this we obtain the inequality \f(z)e isz \ = |/(z)| \e iJSZ \ S |/(z)| (,y >0, y S 0). This 
reduces our present problem to that for (4). Continuing as before gives (9) and (10). ■ 
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EXAMPLE 


An Application of (10) 


Show that 


r 


cos sx 

k*+ X 2 


dx = — e 


,—ks 


r*° 

I sin sx 

J + 2 = 0 (s>0,k> 0). 


Solution . In fact, e** z f(k 2 + z 2 ) has only one pole in the upper half-plane, namely, a simple pole at z — ik. 
and from (4) in Sec. 16.3 we obtain 


Res — o 

z=ik lc 2 


,isz “ gisz ~ 

, _2 ~ 

~r Z L J2 


-ks 


Tlius 


f” 

L x k 2 + . 


dv = 27 ri 


2/A' 


2/A 


-As 


Since = cos sx + / sin sx % this yields the above results [see also (15) in Sec. 1 1.7.] 


Another Kind of Improper Integral 

We consider an improper integral 


( 11 ) 



whose integrand becomes infinite at a point a in the interval of integration. 


lim |/Cx)| - oo. 

x-*a 


By definition, this integral (11) means 

J A 


( 12 ) 


f f(x) dx = lim f f(x) dx + lim \ f(x) dx 
j a J a J a + V ’ 


where both eand rj approach zero independently and through positive values. It may happen 
that neither of these two limits exists if e and 77 go to 0 independently, but the limit 

lim j [ f(x) dx + f f(x) dx 1 
L J A J a+e J 


(13) 


exists. This is called the Cauchy principal value of the integral. It is written 


For example, 



f 1 dx f r € dx r 1 dx 1 

*■ "■ L F = is IL 7 + 1 = 0: 


the principal value exists, although the integral itself has no meaning. 

In the case of simple poles on the real axis we shall obtain a formula for the principal 
value of an integral from -» to oo. This formula will result from the following theorem. 
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THEOREM 1 


PROOF 


Simple Poles on the Real Axis 

If f(z) has a simple pole at z = a on the real axis , then (Fig. 373) 




a -r a a + r x 
Fig. 373. Theorem 1 


By the definition of a simple pole (Sec. 16.2) the integrand f(z) has for 0 < \z — a\ < R 
the Laurent series 


Hz) = -h- + g(z), b x = Res Hz). 

Z — a z=a 

Here g(z) is analytic on the semicircle of integration (Fig. 373) 

C 2 : z = a + re te , 0 ^ 0 ^ 7 r 

and for all z between C 2 and the ,*-axis, and thus bounded on C 2 , say, |g(z)| = M. By 
integration. 


[ f(z) dz= [ ire* dO + f g(z) dz = b x iri + [ g(z) dz. 

J c 2 J o re J C 2 J c 2 

The second integral on the right cannot exceed Mirr in absolute value, by the A/L-inequality 
(Sec. 14.1), and ML = Mirr — » 0 as r — > 0. ■ 

Figure 374 shows the idea of applying Theorem 1 to obtain the principal value of the 
integral of a rational function f(x) from -oo to oo For sufficiently large R the integral over 
the entire contour in Fig. 374 has the value J given by 2m times the sum of the residues 
of f(z) at the singularities in the upper half-plane. We assume that f(x) satisfies the degree 



Fig. 374. Application of Theorem 1 
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EXAMPLE 4 


condition imposed in connection with (4). Then the value of the integral over the large 
semicircle S approaches 0 as R — » $c. For r -» 0 the integral over C 2 (clockwise!) 
approaches the value 

K = —m Res f(z) 

z=a 

by Theorem 1. Together this shows that the principal value P of the integral from — oo to 
oc plus K equals J; hence P — J — K = J + m Res z=a /(z). If /(z ) has several simple 
poles on the real axis, then K will be - iri times the sum of the corresponding residues. 
Hence the desired formula is 

(14) pr. v. f f(x) dx = 27ri 2 Res /(z) + ™ 2 Res /(z) 

— oc 


where the first sum extends over all poles in the upper half-plane and the second over all 
poles on the real axis, the latter being simple by assumption. 


Poles on the Real Axis 

Find the principal value 


Solution . Since 



dx 

(a- 2 - 3.v + 2)(A- 2 + 1) ' 


.v 2 - 3 a + 2 = (.V - l)(.v - 2). 


the integrand /(*), considered for complex z, has simple poles at 

• = I, Res f(z) = I ^ 1 

«-i L (j - 2Ks 2 + I) J 

1 

“ ~ 2 ’ 

Res f{z) = I 5 1 

2=2 IKz 2 + l) ] z =2 

J_ 

" 5 ’ 

fS /<!) ■[(=’- 3= I 2K^, •)],., 

- 1 _ 3 ~ / 

6 + 2/ ” 20 1 

and at z = — / in the lower half-plane, which is of no interest here. From (14) we get the answer 

" " L 6* - 2v “ i 1 w ) + " (- i * i) - ^ ' ■ 

More integrals of the kind considered in this section are included in the problem set. Try 
also your CAS, which may sometimes give you false results on complex integrals. 
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|mT1 integrals involving cosine and sine 

Evaluate the following integrals. (Show the details of your 
work.) 


l r— ! 

J 0 7 + ( 

/•2 IT 

U 3^ 
s f — 

* J 0 5-4 sin 0 

J -2tt 

0 ITT 


do 

4* 6 cos 0 


o 37—12 cos 0 


12 cos 20 


r d0 

J 0 2 4- cos 0 

r 2 " <16 

J 0 S — 2 sin 0 

f 2xr sin 2 0 

6 . d0 

J o 5 — 4 cos 0 


Hint, cos 20 


4M) 


„ f 27r 1 + 4 cos 0 

8. I —— ~ 

J o 17-8 cos 0 


9-22 1 IMPROPER INTEGRALS: 

INFINITE INTERVAL OF INTEGRATION 

Evaluate (showing the details): 

f 1». r * 


"•JCtt: 

.3 f-±- 

J_o= (A- 2 + 4) 2 

f" AT 3 

15. : 

J.„ I + .v 

-CM 

^ f COS A* 

°* J-® a: 4 + I 

22. f 

J a: 4 + 5 


f 

(a 2 - 2 a + 5) 2 

r _a_ 

■'-oo at 4 + 16 

r dx 

(x 2 + 1 )(a 2 + 9) 


L„ (A- 2 - 2.V + If 


-LM* 


19 f -*i. 

J-oo A- 4 + 1 

„ f“ sin 3 a 
L J-oo a 4 + 1 


23-27 1 IMPROPER INTEGRALS: 

POLES ON THE REAL AXIS 

Find the Cauchy principal value (showing details): 


3 f 

J-oo A- 3 + A- 

. XT 

s - L & 

‘■M 


A 4 - 1 


dx 

A- 4 + 3a- 2 - 4 


J- CO A 4 - 1 


28. TEAM PROJECT. Comments on Real Integrals. 

(a) Formula (10) follows from (9). Give the details. 

2 

(b) Use of auxiliary results. Integrating e~ z around 
the boundary C of the rectangle with vertices — a , a, 
a + ib, — a + ib , letting a — > oo, and using 


r dx 

-oo 

f x 

r 

J_oc A 2 + 1 

10 - L A- 4 + 1 * 

*'o 


show that 


JV 

-'n 


C-*', ^ 

J* dx= 


.2 < 9 , . Vw .2 

cos 2 £a a a = — — e ° . 
2 


Loo * 4 + 5 a 2 + 4 dX 


(This integral is needed in heat conduction in Sec. 

12 . 6 .) 

(c) Inspection. Solve Probs. 15 and 21 without 
calculation. 

29. CAS EXPERIMENT. Check your CAS. Find out to 
what extent your CAS can evaluate integrals of the 
form (1), (4), and (8) correctly. Do this by comparing 
the results of direct integration (which may come out 
false) with those of using residues. 

30. CAS EXPERIMENT. Simple Poles on the Real 
Axis. Experiment with integrals /(a) dx, 
fix) = [(a - a x )(x - a 2 ) * * • (a - a^)]” 1 , a 5 real and 
all different, k > 1. Conjecture that the principal value 
of these integrals is 0. Try to prove this for a special 
k , say, k = 3. For general k. 
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16 -R-E V-TE W^Q U^E S T I O N S AND PROBLEMS 


1. Laurent series generalize Taylor series. Explain the 
details. 

2. Can a function have several Laurent series with the same 
center? Explain. If your answer is yes, give examples. 

3. What is the principal part of a Laurent series? Its 
significance? 

4. What is a pole? An essential singularity? Give 
examples. 


^ cosh 5 z . , 

21. o . . » C: |z — f| — 2 


22 . 


z 2 + 4 
4z 3 + Iz 


, C: |z + 1| = I 


cos z 

23. cot 8 z, C: |z| = 0.2 

„2 

24. 


sin z 
4z 2 - 1 


C: \z ~ 1| = 2 


5. What is Picard’s theorem? Why did it occur in this 
chapter? 


•.C:M = 


6. What is the Riemann sphere? The extended complex 
plane? Its significance? 

7. Is e Ui * analytic or singular at infinity? cosh z? ( z — 4) 3 ? 
Explain. 

8. What is the residue? Why is it important? 

9. State formulas for residues from memory. 

10. State some further methods for calculating residues. 


26 - ^rri’ c - j * 2 + y 2 = i 


27. 


28. 


_2 
" I 

“ 2 

I5z 4- 9 
z 3 - 9z 

\5z 4- 9 
z 3 - 9z 


, C: \z - 3| = 2 


, C: |z| = 4 


11. What is residue integration? To what kind of complex 
integrals does it apply? 

12. By what idea can we apply residue integration to real 
integrals from — x to x? Give simple examples. 


29-35 


REAL INTEGRALS 


Evaluate by the methods of this chapter (showing the 
details): 


13. What is a zero of an analytic function? How are zeros 
classified? 

14. What are improper integrals? Cauchy principal values? 
Give examples. 

15. Can the residue at a singular point be 0? At a simple 
pole? 

16. What is a meromorphic function? An entire function? 
Give examples. 


17-28 


COMPLEX INTEGRALS 


Integrate counterclockwise around C. (Show the details.) 


tan z , , 

17. — — , C: \z\ = 1 


sin 2z , , 

18. . C: |r| = 1 

■C 

,9 ' 5T7 ' C: 11 - "I ' 3 



dd 

25 - 24 cos 0 
dO 

— . k > 1 

k 4- cos 0 

dO 

I — | sin 0 


sin 0 

3 4- cos 0 


(1 + a - 2 ) 2 
dx 


dO 

dx 


(1 4- a 2 ) 2 
1 4- 2a* 2 


20 . 


iz 4- 1 
Z 2 - iz 4- 2 


. C: \z - 1| = 3 


36. Obtain the answer to Prob. 18 in Sec. 16.4 from the 
present Prob. 35. 
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Laurent Series. Residue Integration 


A Laurent series is a series of the form 


(1) f(z) = 2 a n(z - Zo) n + 2 , ZZ \n 

n = 0 n=l Z °' 

or. more briefly written [but this means the same as (1)!] 

« l r f(z*) 

(1*) f(z) = 2 a n (z - Zo) n , a n = — f — — 

»— oc 2m J c (z- - <o) 


n+1 


(Sec. 16.1) 


dz* 


where n = 0, ± 1 , ±2, • • • . This series converges in an open annulus (ring) A with 
center z 0 . In A the function f(z) is analytic. At points not in A it may have 
singularities. The first series in (1) is a power series. In a given annulus, a Laurent 
series of f(z) is unique, but /(z) may have different Laurent series in different annuli 
with the same center. 

Of particular importance is the Laurent series (1) that converges in a neighborhood 
of zq except at z 0 itself, say, for 0 < \z - z 0 | < R > 0, suitable). The series (or 
finite sum) of the negative powers in this Laurent series is called the principal part 
of f(z) at zo • The coefficient b ± of \/(z - z 0 ) in this series is called the residue of 
f(z) at Zq and is given by [see (1) and (1*)] 

(2) = Res f(z) = t— r f(z*) dz*. Thus <£ f(z *) dz * = 27r/ Res f(z)< 

2m J c J c 

b 1 can be used for integration as shown in (2) because it can be found from 

(3) 5S f(z) ~ [fe - • (S "' I6 ' 3) - 

provided f(z) has at z 0 a pole of order m; by definition this means that that principal 
part has 1 /(z — Zo) m as its highest negative power. Thus for a simple pole (hi = 1), 


Res f(z) = lim (z - z 0 )f(z); 

z=z 0 z-*z 0 


also, 


n PQ. = P(Zo) 
*- e 4 Cj(z) q'(Zo) ' 


If the principal part is an infinite series, the singularity of /(z) at zo is called an 
essential singularity (Sec. 16.2). 

Section 16.2 also discusses the extended complex plane, that is, the complex plane 
with an improper point °° (“infinity”) attached. 

Residue integration may also be used to evaluate certain classes of complicated 
real integrals (Sec. 16.4). 





CHAPTER 1 7 
Conformal Mapping 


If a complex function w = f(z) is defined in a domain D of the z-plane, then to each point 
in D there corresponds a point in the vi>-plane. In this way we obtain a mapping of D onto 
the range of values of f(z) in the w-plane. We shall see that if f(z) is an analytic function, 
then the mapping given by w = f(z) is conformal (angle-preserving), except at points 
where the derivative f\z ) is zero. 

Conformality appeared early in history in connection with constructing maps of the 
globe, which can be conformal (can give directions correctly) or “equiareal” (give areas 
correctly, except for a scale factor), but cannot have both properties, as can be proved 
(see [GR8] in App. 1). 

Conformality is the most important geometric property of analytic functions and gives 
the possibility of a geometric approach to complex analysis. Indeed, just as in calculus 
we use curves of real functions y = f(x) for studying “geometric” properties of functions, 
in complex analysis we can use conformal mappings for obtaining a deeper understanding 
of properties of functions, notably of those discussed in Chap. 13. 

Indeed, we shall first define the concepts of conformal mapping and then consider 
mappings by those elementary analytic functions in Chap. 13. 

This is one purpose of this chapter. A second puipose, more important to the engineer 
and physicist, is the use of conformal mapping in connection with potential problems. In 
fact, in this chapter and in the next one we shall see that conformal mapping yields a 
standard method for solving boundary value problems in (two-dimensional) potential 
theory by transforming a complicated region into a simpler one. Corresponding 
applications will concern problems from electrostatics, heat flow, and fluid flow. 

In the last section (17.5) we explain the concept of a Riemann surface, which fits well 
into the present discussion of “geometric” ideas. 

Prerequisite: Chap. 13. 

Sections that may be omitted in a shorter course: 17.3 and 17.5 

References and Answers to Problems: App. 1 Part D, App. 2. 
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17.1 Geometry of Analytic Functions: 

Conformal Mapping 

A complex function 

(1) w = f(z) = u(x, y) + iv(x, y ) {z = x + iy) 

of a complex variable z gives a mapping of its domain of definition D in the complex 
z-plane into the complex w-plane or onto its range of values in that plane. 1 For any point 
z 0 in D the point w 0 = /(z 0 ) is called the image of z 0 with respect to /. More generally, 
for the points of a curve C in D the image points form the image of C; similarly for other 
point sets in D. Also, instead of the mapping by a function w = f(z) we shall say more 
briefly the mapping w = f(z). 

EXAMPLE 1 Mapping w = /(z) = z 2 

Using polar forms z — re 10 and w = Re we have w = z 2 = r 2 e 2l °. Comparing moduli and arguments 
gives R = r 2 and <f> = 26. Hence circles r = r 0 are mapped onto circles R = r 0 2 and rays 9 = 6 0 onto rays 
<f> = 29 0 . Figure 375 shows this for the region I ^ |z| = 3/2, tt/ 6 ^9^ ir! 3, which is mapped onto the region 
1 M ^ 9/4, tt/3 ^ 9 ^ 2tt/3. 

In Cartesian coordinates we have z = .v + iy and 

u = Re U 2 ) = x 2 — y 2 , v = Im (z 2 ) - 2xy. 

Hence vertical lines x = c = const are mapped onto u = c 2 - y 2 , v = 2cy. From this we can eliminate y. We 
obtain y 2 — c 2 — u and v 2 - 4c 2 y 2 . Together, 

v 2 = 4 rV - «) (Fig- 376). 

These parabolas open to the left. Similarly, horizontal lines y = A' = co«5/ are mapped onto parabolas opening 
to the right. 

v 2 = 4 k 2 {k 2 + //) (Fig. 376). ■ 



fe-plane) (u;-plane) 

Fig. 375. Mapping w = z 2 . Lines |z| = const , arg z = const and their images in the w-plane 


x The general terminology is as follows. A mapping of a set A into a set B is called surjective or a mapping 
of A onto B if every element of B is the image of at least one element of A. It is called injective or one-to-one 
if different elements of A have different images in B . Finally, it is called bijective if it is both surjective and 
injective. 
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THEOREM 1 


PROOF 



Fig. 376, Images of x = const ; y = const under w = z 2 


Conformal Mapping 

A mapping iv = /(z) is called conformal if it preserves angles between oriented curves 
in magnitude as well as in sense. Figure 377 shows what this means. The angle 
a (0 ^ cx ^ 7t) between two intersecting curves C x and C 2 is defined to be the angle 
between their oriented tangents at the intersection point z 0 * And conformality means that 
the images C x * and C 2 * of C x and C 2 make the same angle as the curves themselves in 
both magnitude and direction. 

Conformality of Mapping by Analytic Functions 

The mapping w = f(z) by an analytic function f is conformal , except at critical 
points, that is, points at which the derivative f ' is zero. 


w = z 2 has a critical point at z = 0. where f'(z) = 2z = 0 and the angles are doubled 
(see Fig. 375), so that conformality fails. 

The idea of proof is to consider a curve 

(2) C: z(t) = x(t) + iy(f) 

in the domain of f(z) and to show that w = f(z) rotates all tangents at a point Zo (where 
f(z 0 ) # 0) through the same angle. Now z(t) = dzldt = x{t) H- iy{t) is tangent to C in 
(2) because this is the limit of (zi - Zq)/A/ (which has the direction of the secant z x — z 0 




Fig. 377. Curves C, and C 2 and their respective images 
C,* and C 2 * under a conformal mapping w = /(z) 
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EXAMPLE 


EXAMPLE 3 


in Fig. 378) as z± approaches zo along C. The image C* of C is w = f(z(t)). By the chain 
rule, w = f(z(t))z(r). Hence the tangent direction of C* is given by the argument (use 
(9) in Sec. 13.2) 

(3) arg w = arg f + arg z 

where arg z gives the tangent direction of C. This shows that the mapping rotates all 
directions at a point z 0 in the domain of analyticity of / through the same angle arg f f (zo\ 
which exists as long as f'(zo) # 0. But this means conformality, as Fig. 377 illustrates 
for an angle a between two curves, whose images C ± * and C 2 * make the same angle 
(because of the rotation). ■ 



Fig. 378. Secant and tangent of the curve C 


In the remainder of this section and in the next ones we shall consider various conformal 
mappings that are of practical interest, for instance, in modeling potential problems. 

Conformality of w = z n 

The mapping w = z n > n = 2, 3, • • • , is conformal, except at z = 0, where w f = nz n ~ l = 0. For n - 2 this is 
shown in Fig. 375; we see that at 0 the angles are doubled. For general n the angles at 0 are multiplied by a 
factor n under the mapping. Hence the sector 0^0^ irln is mapped by z n onto the upper half-plane u ^ 0 
(Fig. 379). ■ 


y | v 



Fig. 379. Mapping by w = z n 


Mapping w = z + 1/z. Joukowski Airfoil 

In terms of polar coordinates this mapping is 

»•' = « + iv = r(cos 6 -f- / sin 6) t — (cos 0 - i sin 0). 

r 

By separating the real and imaginary parts we thus obtain 

ii — ci cos 0. v = b sin 0 where a = r + — . b = / - 

r r 

Hence circles \z\ = r = const 1 are mapped onto ellipses x 2 fa 2 + y 2 lb 2 = 1. The circle r = 1 is mapped 
onto the segment -2 ^ it ^ 2 of the w-axis. See Fig. 380. 
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EXAMPLE 4 




Now the derivative of w is 


w 


\ (z+IXz-1) 

,2 ~ .2 


which is 0 at z = ± 1 . These are the points at which the mapping is not conformal. The two circles in Fig. 38 1 
pass through z = — I . The larger is mapped onto a Jankowski airfoil. The dashed circle passes through both - 1 
and 1 and is mapped onto a curved segment. 

Another interesting application of w = z + Hz (the flow around a cylinder) will be considered in Sec. 18.4. H 


y 

t; 


/ 

/ 

6 1 

'\c 

V 


P i 


— 0 
-V 

O' — o*^- 

J) 1 * -2 

2 u 


Fig. 381. Joukowski airfoil 


Conformality of w = e x 

From (10) in Sec. 13.5 we have \e z \ - e x and Arg z = y. Hence e z maps a vertical straight line .v = a* 0 = const 
onto the circle \w\ = e 0 and a horizontal straight line y = y 0 = const onto the ray arg w = y 0 . The rectangle 
in Fig. 382 is mapped onto a region bounded by circles and rays as shown. 

The fundamental region - tt < Arg z = it of e z in the --plane is mapped bijectively and conformally onto 
the entire w-plane without the origin w = 0 (because e z = 0 for no z ). Figure 383 shows that the upper half 
0 < y ^ 77 of the fundamental region is mapped onto the upper half-plane 0 < arg w ^ 7 r. the left half being 
mapped inside the unit disk \w\ ^ 1 and the right half outside (why?). I 



Fig. 382. Mapping by w = e* 


y 


K 


0 X 

(z- plane) 



(tt>-plane) 


Fig. 383. Mapping by w = e‘ 
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EXAMPLE 5 Principle of Inverse Mapping. Mapping w = Ln z 

Principle. The mapping by the inverse z = f~\w) of w = f(z) is obtained by interchanging the roles of the 
z-pUme and the \v- plane in the mapping by vr = fix). 

Now the principal value w = f(z) = Ln z of the natural logarithm has the inverse z = = e w . From 

Example 4 (with the notations ; and w interchanged!) we know that f~ l (w) = e w maps tlie fundamental region 
of the exponential function onto the z-plane without 2 = 0 (because e w ^ 0 for every w). Hence ir = f(z) = Ln z 
maps the z-plane without the origin and cut along the negative real axis (where $ = Im Ln z jumps by 2 tt) 
conformally onto the horizontal strip — 7 r < u ^ 7r of the w-plane. where w = u + iv. 

Since the mapping w = Ln z + 27 ri differs from w = Ln z by the translation I'lri (vertically upward), this 
function maps the z-plane (cut as before and 0 omitted) onto the strip tt < u ^ 3tt. Similarly for each of the 
infinitely many mappings w = In z = Ln z ± 2mri (n = 0, I, 2, • • ♦)• The corresponding horizontal strips 
of width 2 tt (images of the z-plane under these mappings) together cover the whole ir-plane without 
overlapping. M 

Magnification Ratio. By the definition of the derivative we have 


(4) 


lim 

z—+z 0 


m - f(z 0 ) 

Z - z o 




Therefore, the mapping w = f(z) magnifies (or shortens) the lengths of short lines by 
approximately the factor |/ r (z 0 )|- Th e image of a small figure conforms to the original 
figure in the sense that it has approximately the same shape. However, since f(z) varies 
from point to point, a large figure may have an image whose shape is quite different from 
that of the original figure. 

More on the Condition f(z) ^ 0. From (4) in Sec. 13.4 and the Cauchy-Riemann 
equations we obtain 


(5') \f(z)\ 2 = 

du dv 2 

dx dx 

/ du 
\dx 

■ J + 

( dv \ 2 _ dll 
\ dx ) dx 

dv 

dy 

du 

dy 

dv 

dx 

that is. 


du 

du 





(5) 

\f'(z)\ 2 = 

dx 

dv 

dy 

dv 

<)(u, v) 

d{x, .y) 






dx 

dy 






This determinant is the so-called Jacobian (Sec. 10.3) of the transformation w = f(z) 
written in real form it = u(x, y) 9 v = v(x, y). Hence /'(z 0 ) ^ 0 implies that the Jacobian 
is not 0 at z 0 . This condition is sufficient that the mapping w = f(z ) in a sufficiently small 
neighborhood of zq is one-to-one or injective (different points have different images). See 
Ref. [GR4] in App. 1. 


HSEBBSBilgM 

1. Verify all calculations in Example 1 . 

2. Why do the images of the curves |z| = const and 
arg z = const under a mapping by an analytic function 
f(z) intersect at right angles, except at points at which 
f\z) = 0? 

3. Does the mapping w = z = x — iy preserve angles in 
size as well as in sense? 



f£<l MAPPING OF CURVES 

Find and sketch or graph the image of the given curves 
under the given mapping. 

4. .v = 1. 2. 3, 4, 3- = 1, 2, 3, 4; h> = z 2 

5. Curves as in Prob. 4, w = iz (Rotation) 

6. fs| = 1/3, 1/2, 1, 2, 3; Arg z- 0, ±ir/4, ±n/2, ±3ir/2 , 
±7r; iv = l/z 
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MAPPING OF REGIONS 


Find and sketch or graph the image of the given region 
under the given mapping. 

7. — 7t/4 < Arg z < 7t/4, \z\ < 1/2, xv = z 3 


8. x = 1 , w = Mz 


9. \z\ > 1, vv = 3z 


10. Im z > 0, vv = i — z 

11. A* ^ 0, V ^ 0, \z\ ^ 4; w = z 2 

12. —1 ^ x = 1, —7T < y < tt; w = <? 


13. In 3 < a < In 5, vv = e 2 

14. — 7T < y ^37 r, vv = 

15. 2 ^ |z| ^ 3, tt/4 ^ 0 ^ 7r/2; vv = Ln z 


16. CAS EXPERIMENT. Orthogonal Nets. Graph the 
orthogonal net of the two families of level curves 
Re / (-) = const and Im /(z) = const, where 
(a) f(z) = ; 4 , (b) f(z) = Hz, (c) /( z) = l/z 2 . 
(d) f(z) = (z + 0/(1 + iz). Why do these curves 
generally intersect at right angles? In your work, 
experiment to get the best possible graphs. Also do the 
same for other functions of your own choice. Observe 
and record shortcomings of your CAS and means to 
overcome such deficiencies. 


17-23 


FAILURE OF CONFORMALITY 


Find all points at which the following mappings are not 
conformal. 


17. z(z 4 - 5) 

19. COS 7 TZ 

21. z 2 + az + b 

23. (z - tf) 3 , (z 3 - af 


18. z 2 + l/z 2 
20. cosh2z 
22. exp (z 5 - 80z) 


24-28 


MAGNIFICATION RATIO, JACOBIAN 


Find the magnification ratio M. Describe what it tells you 
about the mapping. Where is M equal to I? Find the 
Jacobian J. 


24. vv = £z 2 25. vv = <? 2 

26. w = z 3 27. vv = Ln z 


28. vv = l/z 


29. Magnification of Angles. Let f(z) be analytic at zo- 
Suppose that /'(z 0 ) = 0, • • • , f k ~ l \z 0 ) = 0. Then 
the mapping vv = f(z) magnifies angles with vertex at 
z 0 by a factor k. Illustrate tills with examples for 
k — 2,3, 4. 

30. Prove the statement in Prob. 29 for general k = 1, 
2, • • • . Hint. Use the Taylor series. 


17.2 Linear Fractional Transformations 

Conformal mappings can help in modeling and solving boundary value problems by first 
mapping regions conformally onto another. We shall explain this for standard regions 
(disks, half-planes, strips) in the next section. For this it is useful to know properties of 
special basic mappings. Accordingly, let us begin with the following very important class. 
Linear fractional transformations (or Mobius transformations) are mappings 

az + b 

(1) vv = (ad — be =£ 0) 

cz + d 


where a , b , c, d are complex or real numbers. Differentiation gives 

, _ a(cz + d) — c(az + b) _ ad — be 
(2) M ' = (cz + df " (cz + df ’ 

This motivates our requirement ad — be ¥= 0. It implies conformality for all z and excludes 
the totally uninteresting case vv' = 0 once and for all. Special cases of (1) are 


0 ) 


w = z + b 
vv = az with \a\ = 1 
vv = az + b 
vv = l/z 


(Translations) 

(Rotations) 

(Linear transfonnations) 
(Inversion in the unit circle). 
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EXAMPLE 1 


THEOREM 1 


Properties of the Inversion w = 1/z (Fig. 384) 

In polar forms z = re 10 and w = Re ul> ihe inversion w = 1 lz is 


Re 1 * ~ — 777 = — e~ l ° and gives R = — . <f> = -0. 

re r r 

Hence the unit circle |z| = r — 1 is mapped onto the unit circle |u’| = R = I ; w = = e~ t0 . For a general z 

the image w = \/z can be found geometrically by marking \w\ = R = II r on the segment from 0 to z and then 
reflecting the mark in die real axis. (Make a sketch.) 

Figure 384 shows that w = 1/z maps horizontal and vertical straight lines onto circles or straight lines. Even 
the following is true. 

w — 1 /z maps every straight line or circle onto a circle or straight line. 



Proof. Every straight line or circle in the z-plane can be written 

A(x 2 + y 2 ) + B.x + Cy + D = 0 {A, BCD real). 

A = 0 gives a straight line and A =£ 0 a circle. In terms of z and z this equation becomes 


T + 7 

Azz + C- 


- + D = 0. 


2 21 - 

Now xv = 1/z. Substitution of z = 1/w and multiplication by ww gives the equation 

w + w w — \v _ 

A + B — - — + C — — — *h Dxvw = 0 
2 2 / 


or. in terms of u and v . 


>1 + — Cv -I- D(u 2 + v 2 ) - 0. 


This represents a circle (if D =£ 0) or a straight line (if D = 0) in the u*-plane. 


The proof in this example suggests the use of z and z instead of a; and y, a general principle 
that is often quite useful in practice. 

Surprisingly, every linear fractional transformation has the property just proved: 


Circles and Straight Lines 

Every linear fractional transformation (1) maps the totality of circles and straight 
lines in the z-plane onto the totality of circles and straight lines in the w-plane. 
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PROOF This is trivial for a translation or rotation, fairly obvious for a uniform expansion or 
contraction, and true for w = 1 Iz, as just proved. Hence it also holds for composites of 
these special mappings. Now comes the key idea of the proof: represent (1) in terms of 
these special mappings. When c = 0, this is easy. When c # 0, the representation is 


w = K 


1 

cz + d 



where 


ad — be 

K= 

c 


This can be verified by substituting K , taking the common denominator and simplifying; 
this yields (1). We can now set 

= cz , w 2 = + d, w 3 = — , vi > 4 = Kw 3 , 

w 2 

and see from the previous formula that then w = w 4 + ale. This tells us that ( I ) is indeed 
a composite of those special mappings and completes the proof. ■ 


Extended Complex Plane 

The extended complex plane (the complex plane together with the point oo in Sec. 16.2) 
can now be motivated even more naturally by linear fractional transformations as follows. 

To each z for which cz + d =£ 0 there corresponds a unique w in (1). Now let e # 0. 
Then for z = —die we have cz + d = 0, so that no w corresponds to this £. This suggests 
that we let w = oo be the image of z = -die. 

Also, the inverse mapping of (1) is obtained by solving (1) for v, this gives again a 
linear fractional transformation 


(4) 


dw — b 
—cw + a 


When c # 0, then cw — a = 0 for w = ale , and we let ale be the image of z = °°. With 
these settings, the linear fractional transformation (1) is now a one-to-one mapping of the 
extended z-plane onto the extended v^-plane. We also say that every linear fractional 
transformation maps “the extended complex plane in a one-to-one manner onto itself.” 
Our discussion suggests the following. 

General Remark. If z = °°, then the right side of (1 ) becomes the meaningless expression 
(a • oo + b)/(c • «> + d). We assign to it the value w = ale if c =£ 0 and w = oo if c = 0. 

Fixed Points 

Fixed points of a mapping w = f(z) are points that are mapped onto themselves, are “kept 
fixed” under the mapping. Thus they are obtained from 

W = f(z) = z. 

The identity mapping h’ = i has eveiy point as a fixed point. The mapping w = z has 
infinitely many fixed points, vi- = l/z has two, a rotation has one, and a translation none 
in the finite plane. (Find them in each case.) For (1), the fixed-point condition w = z is 

az + b 
cz + cl 


(5) 


z = 


thus 


cz 2 - (a - d)z - b = 0. 
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This is a quadratic equation in z whose coefficients all vanish if and only if the mapping 
is the identity mapping w = z (in this case, a = d ^ 0, b = c = 0). Hence we have 


THEOREM 2 


Fixed Points 

A linear fractional transformation, not the identity, has at most two fixed points . If 
a linear fractional transformation is known to have three or more fixed points, it 
must be the identity mapping w = z- 


To make our present general discussion of linear fractional transformations even more 
useful from a practical point of view, we extend it by further facts and typical examples, 
in the problem set as well as in the next section. 






1. Verify the calculations in the proof of Theorem 1 . 

2. (Composition of LFTs) Show that substituting a linear 
fractional transformation (LFT) into a LFT gives a 
LFT. 


3. (Matrices) If you are familiar with 2X2 matrices, 
prove that the coefficient matrices of (1) and (4) are 
inverses of each other, provided ad — be = 1, and 
that the composition of LFTs corresponds to the 
multiplication of the coefficient matrices. 


4-7 


INVERSE 


Find the inverse z — 
for iv. 


z(w). Check the result by solving z(vv) 


4. vv 


4 z + / 
—3 iz + 1 



6 . 


w = 


z + i 
z - i 


7. »v = 


2 z + 5/ 
4 z 


8-14 


FIXED POINTS 


Find the fixed points. 


8. 

vv = 

81z 5 

9. 

vv = 

(4 + i)z 

10. 

w = 

z + 4i 

11. 

vv = 

(z - if 

12. 


z ~ 1 

13. 


2 iz — 1 

w = 

z + 1 

vv = 





Z + 2 i 


14. w = 


3z + 2 
z - l 


15. Find a LFT whose (only) fixed points are -2 and 2. 

16. Find a LFT (not w = z) with fixed points 0 and 1. 

17. Find all LFTs with fixed points -1 and 1. 

18. Find all LFTs whose only fixed point is 0. 

19. Find all LFTs with fixed points 0 and °°. 

20. Find all LFTs without fixed points in the finite plane. 


17.3 Special Linear Fractional Transformations 

In this section we shall see how to determine linear fractional transformations 

az + b 

(1) w = — {ad- be ^ 0) 

cz + d 

for mapping certain standard domains onto others and how to discuss properties of (1). 

A mapping (1) is determined by a , b, c, d, actually by the ratios of three of these 
constants to the fourth because we can drop or introduce a common factor. This makes it 
plausible that three conditions determine a unique mapping (1): 
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THEOREM 1 


PROOF 


EXAMPLE 1 


Three Points and Their Images Given 

Three given distinct points Z \ , z 2 , z 3 can always be mapped onto three prescribed 
distinct points w 1? w 2 , vv 3 by one , and only one f linear fractional transformation 
w = /(z). This mapping is given implicitly by the equation 

w - w x w 2 - w 3 z ~ Zi z 2 ~ z 3 

( 2 ) . 

W - W 3 W 2 - w x Z~ Z 3 Z 2 - Zi 

{If one of these points is the point x , the quotient of the two differences containing 
this point must be replaced by 1.) 


Equation (2) is of the form F(w) = G{z) with linear fractional F and G. Hence 
w = F~\G(z)) = f{z ), where F*" 1 is the inverse of F and is linear fractional (see (4) in 
Sec. 17.2) and so is the composite F _1 (G(z)) (by Prob. 21), that is, vi> = f(z) is linear 
fractional. Now if in (2) we set w = w l9 w 2 , w 3 on the left and z = Zi, z 2 , z 3 on the right, 
we see that 

F(w x ) = 0, F{w 2 ) = 1, F(w 3 ) = as 

G{z i) = 0, G{z 2 ) = 1, G{z 3 ) = oc. 

From the first column, Ffvvj) = G(z x ), thus = F" 1 (C(z 1 )) = f(zi). Similarly, vi^ 2 = f(z 2 ), 

w 3 = f(z 3 ). This proves the existence of the desired linear fractional transformation. 

To prove uniqueness, let w = g(z) be a linear fractional transformation, which also 
maps zj onto w^j = 1, 2, 3. Thus wj = g{zf). Hence ^ -1 (vv 7 ) = Zj, where Wj = 
Together, g“ 1 (/(z 7 *)) = Zj, a mapping with the three fixed points z x , z 2 , z 3 . By Theorem 2 
in Sec. 17.2, this is the identity mapping, g -1 (/(z)) = z for all z. Thus /(z) = g(z) for all 
z, the uniqueness. 

The last statement of Theorem 1 follows from the General Remark in Sec. 17.2. ■ 

Mapping of Standard Domains by Theorem 1 

Using Theorem 1. we can now find linear fractional transformations according to the 
following 

Principle. Prescribe three boundary points Zi, z 2 * z 3 of the domain D in the z-plane. 
Choose their images w x , w 2i vi> 3 on the boundary of the image D* of D in the w-plane. 
Obtain the mapping from (2). Make sure that D is mapped onto D*, not onto its 
complement. In the latter case, interchange two w-points. (Why does this help?) 

Mapping of a Half-Plane onto a Disk (Fig. 385) 

Find the linear fractional transformation (1) that maps z x = -1. z 2 = 0, z 3 = 1 onto = -1, \v 2 = 
n* 3 = 1. respectively. 

Solution . From (2) we obtain 

w-(-\) - 1 _ z-(-l) 0- 1 

w- 1 -i-(-l) ~ z- 1 * 0 — (— I) * 

thus 


w 


z - i 
~iz + 


1 ’ 




SEC. 17.3 Special Linear Fractional Transformations 


739 


EXAMPLE 


EXAMPLE 3 



Fig. 385. Linear fractional transformation in Example 1 


Let us show that we can determine the specific properties of such a mapping without much calculation. For 
z = x we have w = (.v - /)/(-/* + I), thus \w\ = 1, so that the .v-axis maps onto the unit circle. Since z = i 
gives w = 0. the upper half-plane maps onto the interior of that circle and the lower half-plane onto the exterior. 
2 = 0, /, 3c go onto w = 0, /. so that the positive imaginary axis maps onto the segment S: u = 0, - 1 

The vertical lines .v = const map onto circles (by Theorem 1 . Sec. 1 7.2) through w = / (the image of z = x ) 
and perpendicular to |n*| = 1 (by conformality; see Fig. 385). Similarly, the horizontal lines y = const map onto 
circles through \v = i and perpendicular to S (by conformality). Figure 385 gives these circles for y ^ 0. and 
for y < 0 they lie outside the unit disk shown. ■ 

Occurrence of <» 

Determine the linear fractional transformation that maps Zi = 0. z 2 = L Z3 = 20 onto W\ = — 1, w 2 = — /• 
W3 = 1, respectively. 

Solution . From (2) we obtain the desired mapping 



This is sometimes called the Cayley transformation . 2 In this case, (2) gave at first the quotient (1 — x )/(z - <*>), 
which we had to replace by 1 . I 


Mapping of a Disk onto a Half-Plane 

Find the linear fractional transformation that maps Z\ = ~\, Z2 = Z3 = I onto w l — 0, w 2 = /. w 3 = 
respectively, such that the unit disk is mapped onto the right half-plane. (Sketch disk and half-plane.) 

Solution . From (2) we obtain, after replacing (/ — «)/(u> — oc) by 1. 

z + 1 m 

W= -T~[- ■ 

Mapping half-planes onto half-planes is another task of practical interest. For instance, 
we may wish to map the upper half-plane y ^ 0 onto the upper half-plane v ^ 0. Then 
the x-axis must be mapped onto the w-axi$. 


2 ARTHUR CAYLEY (1821-1895), English mathematician and professor at Cambridge, is known for his 
important work in algebra, matrix theory, and differential equations. 
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EXAMPLE 4 


EXAMPLE 5 


Mapping of a Half-Plane onto a Half-Plane 

Find the linear fractional transformation that maps — —2, Z 2 ~ 0, zz - 2 onto w Y = oo, w 2 = 1/4, 
w z = 3/8, respectively. 

Solution. You may verify that (2) gives the mapping function 

1 

vi’ = — 

2z + 4 

What is the image of the *-axis? Of the y-axis? I 

Mappings of disks onto disks is a third class of practical problems. We may readily 
verify that the unit disk in the z-plane is mapped onto the unit disk in the w-plane by the 
following function, which maps Zq onto the center w = 0. 

(3) iv = - — ^ ’ c = kol < 1. 

CZ. ~ 1 

To see this, take \z\ = l, obtaining, with c = Zq as in (3), 

|z - Z 0 \ = \z - c\ 

= \z\ \z - c\ 

= |zz - cz\ = |1 - cz\ = | cz - l|. 

Hence 

\w\ = I z - z 0 |/|cz - 1 1 = I 

from (3), so that |z| = 1 maps onto |w| = 1, as claimed, with z 0 going onto 0, as the 
numerator in (3) shows. 

Formula (3) is illustrated by the following example. Another interesting case will be 
given in Prob. 10 of Sec. 18.2. 

Mapping of the Unit Disk onto the Unit Disk 

Taking zo = | in (3), we obtain (verify!) 

w = 7- 2 ( Rg - 386 >- ® 



Fig. 386. Mapping in Example 5 
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EXAMPLE 6 Mapping of an Angular Region onto the Unit Disk 

Certain mapping problems can be solved by combining linear fractional transformations with others. For instance, 
to map the angular region D: —irl$ ^ arg z = ir/C (Fig. 387) onto the unit disk |w| ^ 1 . we may map D by 
Z = z 1 2 3 4 5 6 onto the right Z-half-plane and then the latter onto the disk |w| ^ 1 by 

.z-i .. . m 

w = / , combined w = / -5 . ■ 

Z 1 \ £ + I 



(z-plane) 



(Z-plane) 


(«/-plane) 



Fig. 387. Mapping in Example 6 

This is the end of our discussion of linear fractional transformations. In the next section 
we turn to conformal mappings by other analytic functions (sine, cosine, etc.). 




1. Derive the mapping in Example 2 from (2). 

2. (Inverse) Find the inverse of the mapping in Example 
1. Show that under that inverse the lines a- = const are 
the images of circles in the w-plane with centers on the 
line v = 1 . 

3. Verify the formula (3) for disks. 

4. Derive the mapping in Example 4 from (2). Find its 
inverse and prove by calculation that it has the same 
fixed points as the mapping itself. Is this surprising? 

5. (Inverse) If w = f(z) is any transformation that has an 
inverse, prove the (trivial!) fact that / and its inverse 
have the same fixed points. 

6. CAS EXPERIMENT. Linear Fractional 
Transformations (LFTs). (a) Graph typical regions 
(squares, disks, etc.) and their images under the LFTs in 
Examples 1-5. 

(b) Make an experimental study of the continuous 
dependence of LFTs on their coefficients. For instance, 
change the LFT in Example 4 continuously and graph 
the changing image of a fixed region (applying 
animation if available). 

|7-I5| LFTs FROM THREE POINTS AND 
THEIR IMAGES 

Find the LFT that maps the given three points onto the three 
given points in the respective order. 


7. - 1, 0, 1 onto -0.6 - 0.8/, —1, —0.6, + 0.8/ 

8 . 0 , 1 , 2 onto 1 , 5 , | 

9. 2 ./, —2/. 4 onto —4 + 2 /, —4 — 2/, 0 

10 . /, — 1 , 1 onto — 1 , -/, / 

11. 0, L 30 onto 00 , 1, 0 

12 . 0 , —/, / onto — 1 , 0 , 00 

13. 2/, /, 0 onto §/, 2 /, <» 

14. 0, 2 /, —2/ onto — 1, 0, 00 

15. —1,0, 1 onto 0, 1, -1 

16. Find all LFTs w(z) that map the A-axis onto the w-axis. 

17. Find a LFT that maps \z\ = 1 onto \w\ ^ 1 so that 
z = ill is mapped onto w = 0. Sketch the images of 
the lines a = const and y — const. 

18. Find an analytic function that maps the second quadrant 
of the z-plane onto the interior of the unit circle in the 
vv-plane. 

19. Find an analytic function w = f(z) that maps the region 
0 ^ arg z ^ 7t/4 onto the unit disk |w| ^ 1. 

20. (Composite) Show that the composite of two LFTs is 
a LFT. 





742 


CHAP. 17 Conformal Mapping 


17*4 Conformal Mapping by Other Functions 

So far we have discussed the mapping by z 11 , e z (Sec. 17.1) and linear fractional 
transformations (Secs. 17.2, 17.3), and we shall now turn to the mapping by trigonometric 
and hyperbolic analytic functions. 




(ur-plane) 


Fig. 388. Mapping w = u + iv = sin z 


Sine Function. Figure 388 shows the mapping by 

(1) w = u + iv = sin z — sin a- coshy + i cos a* sinhy (Sec. 13.6). 
Hence 

(2) u = sin a* cosh y, v = cos a* sinh y. 

Since sin z is periodic with period 27 r, the mapping is certainly not one-to-one if we 
consider it in the full z-plane. We restrict z to the vertical strip S: —57 r ^ a* ^ \tt in 
Fig. 388. Since f'(z) = cos z = 0 at z = —2^ the mapping is not conformal at these two 
critical points. We claim that the rectangular net of straight lines a* = const and y = const 
in Fig. 388 is mapped onto a net in the w-plane consisting of hyperbolas (the images of 
the vertical lines a* = const) and ellipses (the images of the horizontal lines y = const) 
intersecting the hyperbolas at right angles (conformality!). Corresponding calculations are 
simple. From (2) and the relations sin 2 x 4* cos 2 x = 1 and cosh 2 y — sinh 2 y = 1 we obtain 

w 2 y 2 

. 2 — = cosh 2 y — sinh 2 y = 1 (Hyperbolas) 

sin a* cos A 


u v* 9 9 

75 1 — —5 — = sin x + cos a* = 1 (Ellipses). 

cosh 2 y sinh 2 y 

Exceptions are the vertical lines x = ±\ 7r, which are “folded” onto u ^ — 1 and 
u ^ 1 (v = 0), respectively. 

Figure 389 illustrates this further. The upper and lower sides of the rectangle are mapped 
onto semi-ellipses and the vertical sides onto —cosh 1 ^ a ^ - 1 and 1 ^ it ^ cosh 1 

(v = 0), respectively. An application to a potential problem will be given in Prob. 5 of 
Sec. 18.2. 
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Fig. 389. Mapping by w = sin z 


Cosine Function. The mapping w = cos z could be discussed independently, but since 

(3) w = cos z — sin (z + 577 ), 

we see at once that this is the same mapping as sin z preceded by a translation to the right 
through \tt units. 

Hyperbolic Sine. Since 

(4) w = sinh z = —i sin (iz), 

the mapping is a counterclockwise rotation Z = iz through t (i.e., 90°), followed by the 
sine mapping Z* = sin Z, followed by a clockwise 90°-rotation w = - iZ ?*. 

Hyperbolic Cosine. This function 

(5) w = cosh z ~ cos (iz) 

defines a mapping that is a rotation Z = iz followed by the mapping w = cos Z. 

Figure 390 shows the mapping of a semi-infinite strip onto a half-plane by w = cosh z. 
Since cosh 0 = 1, the point z = 0 is mapped onto w = 1. For real z = a* ^ 0, cosh z is 
real and increases with increasing a* in a monotone fashion, starting from 1. Hence the 
positive A-axis is mapped onto the portion u ^ 1 of the w-axis. 

For pure imaginary z = iy we have cosh iy = cos y. Hence the left boundary of the strip 
is mapped onto the segment 1 ^ u ^ - 1 of the «-axis, the point z = tt/ corresponding to 

W = cosh ITT = COS 7 T = — 1. 

On the upper boundary of the strip, y = 7 r, and since sin 77 = 0 and cos 7 r = — 1 , it follows 
that this part of the boundary is mapped onto the portion « ^ — 1 of the w-axis. Hence 
the boundary of the strip is mapped onto the w-axis. It is not difficult to see that the interior 
of the strip is mapped onto the upper half of the w-plane, and the mapping is one-to-one. 

This mapping in Fig. 390 has applications in potential theory, as we shall see in 
Prob. 12 of Sec. 18.3. 



Fig. 390. Mapping by w = cosh z 
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Tangent Function. Figure 391 shows the mapping of a vertical infinite strip onto the 
unit circle by w = tan z, accomplished in three steps as suggested by the representation 
(Sec. 13.6) 


sm z 

w = tan z = 

cos z 


(e™ - e-*)H 
e™ + 


(e 2fe - 1 )// 
e 2U + 1 


Hence if we set Z = e Zlz and use Mi = —i, we have 


( 6 ) 


w — tan z — —iW, 


Z- 
Z+ 1 * 


Z = e 2iz . 


We now see that w = tan z is a linear fractional transformation preceded by an exponential 
mapping (see Sec. 17.1) and followed by a clockwise rotation through an angle r (90°). 

The strip is S: — \ir <x<\i t, and we show that it is mapped onto the unit disk in the 
w-plane. Since Z = e 2tz = e ~ 2y+2ix , we see from (10) in Sec. 13.5 that |Z] = e ~ 2y , 
ArgZ = 2x. Hence the vertical lines x = — 7t/ 4, 0, 7r/4 are mapped onto the rays 
Arg Z = — 7r/2, 0, 7 t/ 2, respectively. Hence S is mapped onto the right Z-half-plane. Also 
|Z| = e~ 2y < 1 if y > 0 and |Z| > 1 if y < 0. Hence the upper half of S is mapped inside 
the unit circle |Z| = 1 and the lower half of S outside |Z| = 1, as shown in Fig. 391. 
Now comes the linear fractional transformation in (6), which we denote by g(Z): 

( 7 ) = 


For real Z this is real. Hence the real Z-axis is mapped onto the real W-axis. Furthermore, 
the imaginary Z-axis is mapped onto the unit circle |W| = 1 because for pure imaginary 
Z = iY we get from (7) 

... . *T - 1 

M = \g(iY)\ = =1. 

The right Z-half-plane is mapped inside this unit circle \W\ = 1, not outside, because 
Z = 1 has its image g(l ) = 0 inside that circle. Finally, the unit circle |Z| = 1 is mapped 



(z- plane) 


0Z- plane) (W-plane) (w-plane) 

Fig. 391. Mapping by w = tan z 
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onto the imaginary W-axis, because this circle is Z = e l4> , so that (7) gives a pure imaginary 
expression, namely, 


e i<ft - 1 e 1 * 12 - e- 1 ** 2 i sin (tf>/2) 

} e t<b 4* 1 e % * 12 + e~ t<f>/2 cos (<f>/ 2) ' 

From the W-plane we get to the vv-plane simply by a clockwise rotation through tt/ 2; see (6). 

Together we have shown that w = tan z maps 5: — 7t/4 < Rez < ir/4 onto the unit 
disk |w| = 1, with the four quarters of S mapped as indicated in Fig. 391. This mapping 
is conformal and one-to-one. 



[T— 7~| CONFORMAL MAPPING w = c z 

Find and sketch the image of the given region under w = e z . 

1. 0 = A = 2, — 7T ^ y ^ 77 

2 . — 1 = a = 0 , 0 = y = 77/2 

3. —0.5 < x < 0.5, 37t/4 < y < 5 tt!4 

4. — 3 < a* < 3, 7t/4 < y < 

5. 0<a< l,0<y< 7T 

6. x < 0, 7t/2 < y < tt!2 

7. x arbitrary, 0^y^2ir 


8. CAS EXPERIMENT. Conformal Mapping. If your 
CAS can do conformal mapping, use it to solve 
Prob. 5. Then increase y beyond 7 r, say, to 507ror 1007T. 
State what you expected. See what you get as the 
image. Explain. 


9-12 


CONFORMAL MAPPING w = sin z 


Find and sketch or graph the image of the given region 
under w = sin z. 


9. 0 = a = tt, 0 s y = 1 

10 . 0 < x < 77/6, y arbitrary 

11. 0 < a < 2tt, 1 < y < 5 

12. - 7t/4 < a* < 7t/4, 0 < y < 3 


17. Find all points at which the mapping w = cosh ttz is 
not conformal. 


18-22 


CONFORMAL MAPPING w = cos z 


Find and sketch or graph the image of the given region 
under w = cos z. 


18. 0 < a* < tt/ 2, 0 < v < 2 

19. 0 < a < 7T, 0 < y < 1 

20. -1 = x — 1,0 ^y ^ 1 

21. 7T < A < 2 7T, y < 0 

22. 0 < x < 2n t, 1/2 < y < 1 


23. Find the images of the lines y = c = const under the 
mapping w = cos z. 


24. Show that w = Ln 

z + 1 


maps the upper half-plane 


onto the horizontal strip 0 ^ Im w ^ tt as shown in 
the figure. 


A BCD E 

I I I 

(00) -1 0 1 (00) 

(z-plane) 


13. Determine all points at which w = sin z is not 
conformal. 

14. Find and sketch or graph the images of the lines a = 0, 
± 77 / 6 , ± 77 / 3 , ±7t/2 under the mapping w = sin z. 

15. Find an analytic function that maps the region R 
bounded by the positive a- and y-axes and the hyperbola 
xy = 7t/2 in the first quadrant onto the upper half-plane. 
Hint. First map the region onto a horizontal strip. 

16. Describe the mapping w = coshz in terms of the 
mapping w = sin z and rotations and translations. 


m 


£>*(~) E* = A* B*(~) 

6 

0 

( 10 -plane) 

Problem 24 

25. Find and sketch the image of R: 2 S |z| S 3, 
tt/4 ^ 0 ^ 7t/2 under the mapping w = Ln z. 
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17.5 Riemann Surfaces. Optional 

Riemann surfaces are surfaces on which multivalued relations, such as w = Vz or w = In z, 
become single-valued, that is, functions in the usual sense. We explain the idea, which is 
simple — but ingenious, one of the greatest in complex analysis. 

The mapping given by 

(1) w = u + iu = z 2 (Sec. 17.1) 

is conformal, except at z = 0, where w f = 2z = 0. At z = 0, angles are doubled under 
the mapping. Thus the right z-half-plane (including the positive y-axis) is mapped onto 
the full w-plane, cut along the negative half of the w-axis; this mapping is one-to-one. 
Similarly for the left z-half-plane (including the negative y-axis). Hence the image of the 
full z-plane under w = z 2 “covers the w-plane twice” in the sense that every w # 0 is the 
image of two z-points; if Z\ is one, the other is —Z\. For example, z = i and — / are both 
mapped onto w = — 1 . 

Now comes the crucial idea. We place those two copies of the cut w-plane upon each 
other so that the upper sheet is the image of the right half z-plane R and the lower sheet 
is the image of the left half z-plane L. We join the two sheets crosswise along the cuts 
(along the negative w-axis) so that if z moves from R to L, its image can move from the 
upper to the lower sheet. The two origins are fastened together because w = 0 is the image 
of just one z-point, z = 0. The surface obtained is called a Riemann surface (Fig. 392a). 
w = 0 is called a “winding point” or branch point, w = z 2 maps the full z-plane onto 
tills surface in a one-to-one manner. 

By interchanging the roles of the variables z and w it follows that the double-valued 
relation 

(2) vw = Vi (Sec. 13.2) 

becomes single-valued on the Riemann surface in Fig. 392a, that is, a function in the usual 
sense. We can let the upper sheet correspond to the principal value of Vz. Its image is 
the right vv’-half-plane. The other sheet is then mapped onto the left vi>-half-plane. 



Fig. 392. Riemann surfaces 


Similarly, the triple-valued relation w = Vi becomes single-valued on the three-sheeted 
Riemann surface in Fig. 392b, which also has a branch point at z = 0. 
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The infinitely many-valued natural logarithm (Sec. 13.7) 

w = Inz = Lnz 4* 2mri (n = 0, ±1, ±2, • • •) 

becomes single-valued on a Riemann surface consisting of infinitely many sheets, w = Ln z 
corresponds to one of them. This sheet is cut along the negative jc-axis and the upper edge 
of the slit is joined to the lower edge of the next sheet, which corresponds to the argument 
77 < 0 = 377, that is, to 

ir = Ln z + 27 tL 


The principal value Ln z maps its sheet onto the horizontal strip — tt < v ^ 77. The function 
w = Ln z + 2 tt/ maps its sheet onto the neighboring strip 77 < v ^ 377, and so on. The 
mapping of the points z # 0 of the Riemann surface onto the points of the w-plane is 
one-to-one. See also Example 5 in Sec. 17.1. 


_ 


1. Consider w = Vz. Find the path of the image point w 
of a point z that moves twice around the unit circle, 
starting from the initial position z = 1. 

n .r— 

2. Show that the Riemann surface of vr = Vz consists of 
n sheets and has a branch point at z = 0. 

3. Make a sketch, similar to Fig. 392, of the Riemann 
surface of 

4. Show that the Riemann surface of vr = V(z - l)(z - 2) 
has branch points at z = l and z — 2 and consists of 


two sheets that may be cut along the line segment from 
1 to 2 and joined crosswise. Hint. Introduce polar 
coordinates z — 1 = z — 2 = r 2 e° 1 2 3 4 5 6 7 8 9 . 


5-10] RIEMANN SURFACES 

Find the branch points and the number of sheets of the 
Riemann surface. 

5. V3z + 5 
7. 5 + V 2z+l 


9. e 


Vs 


6. V(1 - s 2 )(4 - z a ) 
8. In (3z - 40 

10. V? 


STIONS AND PROBLEMS 


1. How did we define the angle of intersection of two 
oriented curves, and what does it mean to say that a 
mapping is conformal? 

2. At what points is a mapping vr = /(z) by an analytic 
function not conformal? Give examples. 

3. What happens to angles at zo under a mapping vv = f(z) 
if f'(z 0 ) = 0, /"(zo) = 0. /"'(z 0 ) * 0? 

4. What do “surjective.” “injective,” and “bijeetive” 
mean? 

5. What mapping gave the Joukowski airfoil? 

6. What are linear fractional transformations (LFTs)? Why 
are they important in connection with the extended 
complex plane? 

7. Why did we require that ad — be ^ 0 for a LFT? 

8. What are fixed points of a mapping? Give examples. 

9. Can you remember mapping properties of w = sin z? 
cos z? e z l 


10. What is a Riemann surface? Why was it introduced? 
Explain die simplest example. 


11-16 


MAPPING w = z 2 


Find and sketch die image of the given curve or region under 


11. y = — 1, y = 1 12. Ay = -4 

13. jz| = 4.5, |arg z| < tt/4 14. 0 < y < 2 

15. I < x < 1 16. Im z > 0 


17-22 


MAPPING w = 1/z 


Find and sketch the image of the given curve or region under 
w = 1/z. 


17. x — -J 


18. y = 1 
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19. \z - il = | 20. \z\ < i y < 0 

21. jarg z| < tt/ 4 22. jz| < 1, a* < 0. y > 0 


23-28 1 FAILURE OF CONFORMALITY 

Where is the mapping by the given function not conformal? 
(Give reason.) 

23. 5z 7 + 7 z 5 24. cosh 2z 

25. sin 2z + cos 2z 26. cos ttz 2 

27. exp (z 4 + z 2 ) 28. z + 1/z (z * 0) 


29-34 


LINEAR FRACTIONAL 
TRANSFORMATIONS (LFTs) 


Find the LFT that maps 

29. 0, 1, 2 onto 0, /, 21, respectively 

30. — 1, 1, 2 onto 0, 2, 3/2, respectively 

31. 1, —1, -i onto 1,-1 , i. respectively 

32. —1, — / onto 1 - /, 2, 0, respectively 

33. 0, —2 onto 0, 1, respectively 

34. 0, /, 2/ onto 0, 2i 


35-40 


Fixed Points. Find all fixed points of 


35. iv 


z + 2 
z + 1 


36. iv = 


2/z — 1 
z + 2/ 


3z + 2 
37. vv = - - - 

iz + 5 
38. vv = „ 

DZ + l 

(2 + i)z+ 1 
39. w = 2 

40. vv = z 4 + z 


i 


41 — 15 


GIVEN REGIONS 


Find an analytic function w = f(z) that maps: 


41. The infinite strip 0 < y < tt/3 onto the upper half-plane 
v > 0. 


42. The interior of the unit circle |z| = 1 onto the exterior 
of the circle | w + l| = 5. 

43. Tlie region jc > 0, y > 0, at < k onto the strip 
0 < v < 1. 


44. The semi-disk \z\ < 1, x > 0 onto the exterior of the 
unit circle |»v| = 1. 

45. The sector 0 < arg z < n/3 onto the region it < 1. 




Conformal Mapping 


A complex function w = /(z) gives a mapping of its domain of definition in the 
complex z-plane onto its range of values in the complex vv-plane. If /(z) is analytic , 
this mapping is conformal, that is, angle-preserving: the images of any two 
intersecting curves make the same angle of intersection, in both magnitude and sense, 
as the curves themselves (Sec. 17.1). Exceptions are the points at which f\z) = 0 
(‘‘critical points,” e.g. z = 0 for vv = z 2 ). 

For mapping properties of e z > cosz, sinz, etc. see Secs. 17.1 and 17.4. 

Linear fractional transformations, also called Mobius transformations 

az + b 

(1) w = 7 (Secs. 17.2. 17.3) 

cz + d 

{ad - be : £ 0) map the extended complex plane (Sec. 17.2) onto itself. They solve 
the problems of mapping half-planes onto half-planes or disks, and disks onto disks 
or half-planes. Prescribing the images of three points determines (1) uniquely. 

Riemann surfaces (Sec. 1 7.5) consist of several sheets connected at certain points 
called branch points . On them, multivalued relations become single-valued, that is, 
functions in the usual sense. Examples. For w = Vz we need two sheets (with 
branch point 0) since this relation is doubly-valued. For w = In z we need infinitely 
many sheets since this relation is infinitely many-valued (see Sec. 13.7). 







chapter! 8 

Complex Analysis and 
Potential Theory 


Laplaces’ s equation V 2 <5 = 0 is one of the most important PDEs in engineering 
mathematics, because it occurs in gravitation (Secs. 9.7, 12.10), electrostatics (Sec. 9.7), 
steady-state heat conduction (Sec. 12.5), incompressible fluid flow, etc. The theory of 
solutions of this equation is called potential theory (although “potential” is also used in 
a more general sense in connection with gradients; see Sec. 9.7). 

In the “two-dimensional case” when <5 depends only on two Cartesian coordinates a* 
and y, Laplace’s equation becomes 


V 2 <E> = d> XX + <&,jy = 0. 

From Sec. 13.4 we know that then its solutions <5 are closely related to complex analytic 
functions <5-1 - / 'P. This relation is the main reason for the importance of complex analysis 
in physics and engineering. (We use the notation <t> 4- / 'P since u -I- iv will be needed 
in conformal mapping.) 

In this chapter we shall consider this connection and its consequences in detail and 
illustrate it by modeling typical examples from electrostatics (Secs. 18.1, 18.2), heat 
conduction (Sec. 18.3), and hydrodynamics (Sec. 18.4). This will lead to boundary value 
problems, some of which involving functions whose mapping properties we have studied 
in Chap. 17. Further relating to that chapter, in Sec. 18.2 we explain conformal mapping 
as a method in potential theory. 

In Sec. 18.5 we derive the important Poisson formula for potentials in a circular disk. 

Finally, in Sec. 18.6 we show that results on analytic functions can be used to 
characterize general properties of harmonic functions (solutions of Laplace’s equation 
whose second partial derivatives are continuous). 

Prerequisite: Chaps. 13, 14, 17. 

References and Answers to Problems: App. 1 Part D, App. 2. 
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18.1 Electrostatic Fields 

The electrical force of attraction or repulsion between charged particles is governed by 
Coulomb’s law. This force is the gradient of a function 4>, called the electrostatic 
potential. At any points free of charges, 4> is a solution of Laplace’s equation 

V 2 4> = 0. 

The surfaces <J> = const are called equipotential surfaces. At each point P at which the 
gradient of is not the zero vector, it is perpendicular to the surface <J> = const through 
P; that is, the electrical force has the direction perpendicular to the equipotential surface. 
(See also Secs. 9.7 and 12.10.) 

The problems we shall discuss in this entire chapter are two-dimensional (for the reason 
just given in the chapter opening), that is, they model physical systems that lie in 
three-dimensional space (of course!), but are such that the potential 4> is independent of 
one of the space coordinates, so that <5 depends only on two coordinates, which we call 
x and y. Then Laplace’s equation becomes 


( 1 ) 


V 2 $ = 


a 2 <i> d 2 <t> 

dx 2 *** dy 2 


= 0 . 


Equipotential surfaces now appear as equipotential lines (curves) in the ay-plane. 

Let us illustrate these ideas by a few simple basic examples. 

EXAMPLE 1 Potential Between Parallel Plates 

Find the potential <t> of the field between two parallel conducting plates extending to infinity (Fig. 393), which 
are kept at potentials <1^ and <I> 2 , respectively. 

Solution. From the shape of the plates it follows that <I> depends only on x, and Laplace’s equation becomes 
= 0. By integrating twice we obtain = ax + b. where the constants a and b are determined by the given 
boundary values of <I> on the plates. For example, if the plates correspond to .v = -1 and x- 1. the solution is 


<D(.v) = |(<l>2 - 4>,).V + 


The equipotential surfaces are parallel planes. 




Fig. 393. Potential in Example 1 


Fig. 394. Potential in Example 2 
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EXAMPLE 2 


EXAMPLE 3 

y 

/ 


Fig. 395. Potential 
in Example 3 


EXAMPLE 4 


Potential Between Coaxial Cylinders 

Find the potential between two coaxial conducting cylinders extending to infinity on both ends (Fig. 394) 
and kept at potentials <$>! and 4> 2 . respectively. 

Solution . Here 4> depends only on r = Vv 2 + y 2 . for reasons of symmetry, and Laplace’s equation 
r 2 u. )r + ruy + u $0 = 0 f(5). Sec. 12.9] with tt eo = 0 and u = becomes r<$>" -1- <I>' = 0. By separating variables 
and integrating we obtain 

<P" 1 # , o 

— - = , In 4> = — In r + a, 4> = — , 4> = r/ In r + b 

r r 

and a and b are determined by the given values of on the cylinders. Although no infinitely extended conductors 
exist, the field in our idealized conductor will approximate the field in a long finite conductor in that part which 
is far away from the two ends of the cylinders. H 

Potential in an Angular Region 

Find the potential between the conducting plates in Fig. 395. which are kept at potentials <I>i (the lower plate) 
and 4> 2 , and make an angle a, where 0 < a ^ tt. (In the figure we have a — 120° = 2tt/3.) 

Solution . 6 = Arg z (z = x -F /v =£ 0) is constant on rays 0 = const. It is harmonic since it is the imaginary 

part of an analytic function. Ln z (Sec. 13.7). Hence the solution is 

<P(.v. y) = a + b Arg z 

with a and b determined from the two boundary conditions (given values on the plates) 
a + b(-^a) = 4> 1? a + b($a) = 4> 2 . 

Thus a = (<I> 2 + 4M/2. b = (d> 2 - The answer is 

<P(a\ y) = ~ (4> 2 + <!>!) + — (0 2 - 4) x )61, 0 = arctan — . ■ 

2 a x 


Complex Potential 

Let y) be harmonic in some domain D and >?(*, y) a harmonic conjugate of O in D. 
(See Sec. 13.4, where we wrote it and u, now needed in conformal mapping from the next 
section on; hence the change to <b and 'P.) Then 

(2) F(z) = 0(x, y) + W(x, y) 

is an analytic function of z = x + iy. This function F is called the complex potential 
corresponding to the real potential <&. Recall from Sec. 13.4 that for given <&, a conjugate 
'P is uniquely determined except for an additive real constant. Hence we may say the 
complex potential, without causing misunderstandings. 

The use of F has two advantages, a technical one and a physical one. Technically, F is 
easier to handle than real or imaginary parts, in connection with methods of complex 
analysis. Physically, ''P has a meaning. By conformality, the curves ''P = const intersect 
the equipotential lines <E> = const in the .vy-plane at right angles [except where F'(z) = 0]. 
Hence they have the direction of the electrical force and, therefore, are called lines offeree. 
They are the paths of moving charged particles (electrons in an electron microscope, etc.). 

Complex Potential 

In Example I. a conjugate is 4* = ay. It follows that the complex potential is 


F(z) = az + b = ax + b + iay , 
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EXAMPLE 5 


EXAMPLE 6 


EXAMPLE 7 


and the lines of force are horizontal straight lines v = const parallel to the .Y-axis. M 

Complex Potential 

In Example 2 we have 4> = a In r + b = a In \z\ + b. A conjugate is 'P = a Arg z. Hence the complex 
potential is 

F(z) = a Ln z 4- b 

and the lines of force are straight lines through the origin. F(z) may also be interpreted as the complex potential 
of a source line (a wire perpendicular to the xy-plane) whose trace in die Av-plane is the origin. H 

Complex Potential 

In Example 3 we get F(z) by noting that / Ln z = i In \z\ — Arg z. multiplying this by -b. and adding a: 

F(z) = a — ib Ln z = a + b Arg z ~ ib In |z|. 

We see from this that the lines of force are concentric circles |z| = const. Can you sketch them? B 


Superposition 

More complicated potentials can often be obtained by superposition. 


Potential of a Pair of Source Lines (a Pair of Charged Wires) 

Determine the potential of a pair of oppositely charged source lines of the same strength at the points z - c and 
z- —c on the real axis. 

Solution. From Examples 2 and 5 it follows that the potential of each of the source lines is 
= /fin \z - c\ and <t> 2 = ~K In | z + c|, 

respectively. Here the real constant K measures the strength (amount of charge). These are the real parts of the 
complex potentials 

F x (z) = K Ln (z - c) and F 2 (z) = -K Ln (z + c). 

Hence the complex potential of the combination of the two source lines is 

(3) F(z) = Fi(z ) + F 2 (z) = K [Ln (z - c) - Ln (z + c)]. 


The equipotential lines are the curves 


$ = Re F(z) = K In 



= const. 


thus 


Z - c 


z + c 


= const. 


Tliese are circles, as you may show by direct calculation. The lines of force are 

= Iro F(z) = ^[Arg (z — c) — Arg (z + c)] = const. 

We write this briefly (Fig. 396) 

^ = K($ l - 0 2 ) = const. 

Now — Q 2 is the angle between the line segments from z to c and -c (Fig. 396). Hence the lines of force 
are the curves along each of which the line segment S: -c ^ x ^ c appears under a constant angle. These curves 
are the totality of circular arcs over S . as is (or should be) known from elementary geometry. Hence the lines 
of force are circles. Figure 397 shows some of them together with some equipotential lines. 

In addition to the interpretation as the potential of two source lines, this potential could also be thought of as 
the potential between two circular cylinders whose axes are parallel but do not coincide, or as the potential 
between two equal cylinders that lie outside each other, or as the potential between a cylinder and a plane wall. 
Explain this, using Fig. 397. ■ 


The idea of the complex potential as just explained is the key to a close relation of potential 
theory to complex analysis and will recur in heat flow and fluid flow. 
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z 



Fig. 396. Arguments in Example 7 



Fig. 397. Equipotential lines and lines 
of force (dashed) in Example 7 






|~P] POTENTIAL 

Find and sketch the potential. Find the complex potential: 

1. Between parallel plates at jc = -3 and 3. potentials 
140 V and 260 V, respectively 

2. Between parallel plates at ,y = -4 and 10. potentials 
4.4 kV and 10 kV, respectively 

3. Between the axes (potential 1 10 V) and the hyperbola 
xy = 1 (potential 60 V) 

4. Between parallel plates at y = x and x 4- k, potentials 
0 and 100 V, respectively 


5-8 


COAXIAL CYLINDERS 


Find the potential between two infinite coaxial cylinders of 
radii r x and r 2 having potentials U 1 and £/ 2 , respectively. 
Find the complex potential. 


5. /i = 0.5. r 2 = 2.0, U x = - 1 10 V, U 2 = 1 10 V 


6. /*! = 1. r 2 = 10. U x = 100 V.U 2 = 1 kV 

7. ri = i, r2 = 4, U x = 200 V, U 2 = 0 


8. r, = 0.1. r 2 = 10, U x = 150 V. U 2 = 50 V 


9. Show that <I> = 01 it = (1/tt) arctan (y/x) is harmonic in 
the upper half-plane and satisfies the boundary condition 
<J>(x, 0) = 1 if x < 0 and 0 if x > 0, and the corresponding 
complex potential is F(z) — —{Utt) Ln z. 

10. Map the upper half z-plane onto the unit disk \w\ = 1 so 
that 0, sc. — 1 are mapped onto 1. /, — respectively. What 
are the boundar>' conditions on |n*| = 1 resulting from 
the potential in Prob. 9? What is the potential at w = 0? 

11. Verify by calculation that the equipotential lines in 
Example 7 are circles. 

12. CAS EXPERIMENT. Complex Potentials. Graph 
the equipotential lines and lines of force in (a)-(d) (four 
graphs. Re F(z) and Im F(z) on the same axes). Then 
explore further complex potentials of your choice with 
the purpose of discovering configurations that might 
be of practical interest. 

(a) F(z) = z 2 (b) F(z) = iz 2 

(c) F(z) = 1 Iz (d) F(z) = Hz 


13-15 


POTENTIALS FOR OTHER 
CONFIGURATIONS 


13. Show that F(z) = arccoss (defined in Problem Set 
13.7) gives the potential in Figs. 398 and 399. 


y 




Fig. 399. Other apertures 


14. Find the real and complex potentials in the sector 
-7 r/6 ^ 0 ^ 7 r/6 between the boundary 9 — ±7 r/6 
(kept at 0) and the curve x 3 - 3 xy 2 = 1, kept at 1 10 V. 

15. Find the potential in the first quadrant of the xy-plane 
between the axes (having potential 220 V) and the 
hyperbola xy = 1 (having potential 110 V). 
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18.2 Use of Conformal Mapping. Modeling 

Complex potentials relate potential theory closely to complex analysis, as we have just 
seen. Another close relation results from the use of conformal mapping in modeling and 
solving boundary value problems for the Laplace equation, that is, in finding a solution 
of the equation in some domain assuming given values on the boundary (‘‘Dirichlet 
problem”; see also Sec. 12.5). Then conformal mapping is used to map a given domain 
onto one for which the solution is known or can be found more easily. This solution is 
then mapped back to the given domain. This is the idea. That it works is due to the fact 
that harmonic functions remain harmonic under conformal mapping: 


THEOREM T 


Harmonic Functions Under Conformal Mapping 

Let O* be harmonic in a domain £)* in the w-plane. Suppose that w = u 4* iv = f(z) 
is analytic in a domain D in the z-plane and maps D conformally onto Z>*. Then 
the function 

(1) $(a\ y) = <J>*(h(.v, y), v(x, y)) 

is harmonic in D. 


PROOF The composite of analytic functions is analytic, as follows from the chain rule. Hence, 
taking a harmonic conjugate u) of <!>*, as defined in Sec. 13.4, and forming the 
analytic function F*(w) = <!>*(«, v) + zM^w, u), we conclude that F(z) = F*(f(z)) is 
analytic in D. Hence its real part <E>(a\ y) = Re F(z) is harmonic in D. This completes the 
proof. 

We mention without proof that if D* is simply connected (Sec. 14.2), then a harmonic 
conjugate of <J>* exists. Another proof of Theorem 1 without the use of a harmonic 
conjugate is given in App. 4. ■ 


EXAMPLE 1 Potential Between Noncoaxial Cylinders 

Model ihe electrostatic potential between the cylinders C*: |z| = 1 and C 2 : | z — 2/5| = 2/5 in Fig. 400. Then 
give the solution for the case that is grounded, U x - 0 V, and C 2 has the potential i/ 2 = HO V. 

Solution . We map the unit disk |z| = 1 onto the unit disk |w| = 1 in such a way dial C 2 is mapped onto 
some cylinder C 2 *: |w| = r 0 . By (3), Sec. 17.3. a linear fractional transformation mapping the unit disk onto 
the unit disk is 


( 2 ) 


VI' = 


bz 


- b 

- 1 




V 


(b) w-plane 

Fig. 400. Example 1 


(a) z-plane 
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where we have chosen b = zq real without restriction, zq is of no immediate help here because centers of circles 
do not map onto centers of the images, in general. However, we now have two free constants b and r 0 and shall 
succeed by imposing two reasonable conditions, namely, that 0 and 4/5 (Fig. 400) should be mapped onto r 0 
and — r 0 , respectively. This gives by (2) 

0 - b 4/5 - b 4/5 _ r 0 

>o = ^Tj" = and with this > -'o = 4b/5 - I = 4 ,. o/5 _ , > 

a quadratic equation in r 0 with solutions r 0 = 2 (no good because r 0 < 1) and /*o = 1/2. Hence our mapping 
function (2) with b - 1/2 becomes that in Example 5 of Sec. 17.3, 

2z - 1 

(3) v = Kz) « -rry . 

From Example 5 in Sec. 18.1, writing n* for z we have as the complex potential in the vr-plane the function 
F*{w) = flLn w 4- k and from this the real potential 

<!>*(//, u) = Re F*(>r) = a In \w\ + k. 

This is our model. We now determine a and k from the boundary conditions. If \w\ = l. tlien <l>* = « In 1 + k = 0, 
hence k = 0. If |tr| = ;* 0 = 1/2. then <1>* = a In (1/2) = 1 10, hence a = 1 10/In (1/2) = - 158.7. Substitution 
of (3) now gives the desired solution in the given domain in the z-plane 

F(Z) = F*(f(z» = « Ln . 

The real potential is 

2z - 1 | 

<h(.v, y) = Re F(z) = a In _ Z o ’ a ~ “ 158.7. 

Can we “see” this result? Well, <P(.v, y) = const if and only if |(2s - l)/(z - 2)| = const, that is, \w\ = const 
by (2) with b = 1/2. These circles are images of circles in the z-plane because the inverse of a linear fractional 
transformation is linear fractional (see (4), Sec. 17.2), and any such mapping maps circles onto circles (or 
straight lines), by Theorem 1 in Sec. 17.2. Similarly for the rays arg w = const. Hence the equipotential lines 
0 (a\ y) = const are circles, and the lines of force are circular arcs (dashed in Fig. 400). These two families of 
curves intersect orthogonally, that is, at right angles, as shown in Fig. 400. ■ 

EXAMPLE 2 Potential Between Two Semicircular Plates 

Model the potential between two semicircular plates P Y and P 2 in Fig. 401a having potentials -3000 V and 
3000 V, respectively. Use Example 3 in Sec. 18.1 and conformal mapping. 

Solution, Step 1, We map the unit disk in Fig. 401a onto the right half of the ir-plane (Fig. 401b) by using 
the linear fractional transformation in Example 3, Sec. 17.3: 

1 + z 

«' = /(z) = 1 . 

I — z 



-2 kV 

(a) z-plane (b) t^plane 


Fig. 401. Example 2 
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The boundary \z\ = 1 is mapped onto the boundary // = 0 (the u-axis), with z = — 1. /, 1 going onto w = 0, /. 
respectively, and z = — i onto w = — t. Hence the upper semicircle of \z\ = I is mapped onto the upper half, 
and the lower semicircle onto the lower half of the u-axis, so that the boundary conditions in the w-plane are 
as indicated in Fig. 401b. 

Step 2. We determine the potential <&*(«, v ) in the right half-plane of the w-plane. Example 3 in Sec. 18.1 with 
a = it, Ui = -3000, and U 2 = 3000 [with $*(m, v ) instead of Q>(.\\ y)j yields 

v 

<p = arctan — . 

M 

On the positive half of the imaginary axis (<p = 7^2), this equals 3000 and on the negative half —3000. as it 
should be. <t>* is the real part of the complex potential 


6000 

<&*(«. v ) = <p, 

7T 


F*(w) = - 


6000 / 
7 T 


Ln u*. 


Step 3. We substitute the mapping function into F * to get the complex potential F(z) in Fig. 40 la in the 
form 


F(z) = F*{f(z)) = 


6000 / I + z 
Ln 


The real part of this is the potential we wanted to determine: 


6000 I + z 

4>(.v. v) = Re F(z) = 1m Ln 

7T 1 “ Z 


6000 
7 T 


Arg 


1 + z 
1 - s ‘ 


As in Example 1 we conclude that the equipotential lines <I>(.v, y) = const are circular arcs because they correspond 
to Arg [(1 + z)/(l — z)) = const . hence to Arg u* = const. Also. Arg w = const are rays from 0 to the images 
of z = —1 and 2=1, respectively. Hence the equipotential lines all have -1 and 1 (the points where the 
boundary potential jumps) as their endpoints (Fig. 40 la). The lines of force are circular arcs, too, and since they 
must be orthogonal to the equipotential lines, their centers can be obtained as intersections of tangents to the 
unit circle with the .v-axis. (Explain!) ■ 


Further examples can easily be constructed. Just take any mapping w = f(z) in Chap. 17, 
a domain D in the z-plane, its image D* in the vv-plane, and a potential <£* in Z>*. Then 
(1) gives a potential in D. Make up some examples of your own, involving, for instance, 
linear fractional transformations. 


Basic Comment on Modeling 

We formulated the examples in this section as models on the electrostatic potential. It 
is quite important to realize that this is accidental. We could equally well have phrased 
everything in terms of (time-independent) heat flow; then instead of voltages we would 
have had temperatures, the equipotential lines would have become isotherms (= lines 
of constant temperature), and the lines of the electrical force would have become lines 
along which heat flows from higher to lower temperatures (more on this in the next 
section). Or we could have talked about fluid flow; then the electrostatic lines of force 
would have become streamlines (more on this in Sec. 18.4). What we again see here is 
the unifying power of mathematics: different phenomena and systems from different 
areas in physics having the same types of model can be treated by the same mathematical 
methods. What differs from area to area is just the kinds of problems that are of practical 
interest. 



SEC. 18.3 Heat Problems 


757 


1. Verify Theorem 1 for <£*(«, v) = u 2 - v 2 < 
w = f(z ) = e z and any domain D. 

2. Verify Theorem I for <£*(//, v) = wu, w - f(z) = 
and D: x ^ 0, 0 ^ y ^ tt. Sketch D and D*. 

3. Carry out all steps of the second proof of Theorem 1 
(given in App. 4) in detail. 

4. Derive (3) from (2). 

5. Let Z)* be the image of the rectangle D: 

0 = a* = 2^ 0 = y = 1 under w = sin z, and 
<£*(//, v ) = it 2 - u 2 . Find the corresponding 
potential 4> in D and its boundary values. 

6. What happens in Prob. 5 if you replace the potential 
by the conjugate <I>* = 2wi>? Sketch or graph some of 
the equipotential lines = const. 

7. CAS PROJECT. Graphing Potential Fields. 

(a) Graph equipotential lines in Probs. 1 and 2. 

(b) Graph equipotential lines if the complex potential 
is F(z) = iz 2 . F(Z) = F(z) = ie\ F(z) = e iz . 

(c) Graph equipotential surfaces corresponding to 
F(z) = In z as cylinders in space. 

8. TEAM PROJECT. Noncoaxial Cylinders. Find the 
potential between the cylinders C x \ |z| = 1 (potential 
U x = 0) and C 2 : \z - c| = c (U 2 = 110 V), where 
0 < c <2* Sketch or graph the equipotential curves 
and their orthogonal trajectories for c = 0.1, 0.2, 0.3, 
0.4. Try to think of the further extension C x : \z\ = 1, 
C 2 : \z ~ c\ = p # c. 

9. Find the potential <b in the region R in the first quadrant 
of the z-plane bounded by the axes (having potential 
U x ) and the hyperbola y = 1 /a* (having potential 0) in 
two ways, (i) directly, (ii) by mapping R onto a suitable 
infinite strip. 


10. (Extension of Example 2) Find the linear fractional 
transformation z = g(Z) that maps |Z| ^ 1 onto \z\ = 1 
with Z = i/2 being mapped onto z — 0. Show that 
Z x = 0.6 4- 0.8/ is mapped onto z = -1 and 
Z 2 = —0.6 4- 0.8/ onto z— 1, so that the equipotential 
lines of Example 2 look in |Z| ^ 1 as shown in Fig. 402. 



Fig. 402. Problem 10 


11. The equipotential lines in Prob. 10 are circles. Why? 

12. Show that in Example 2 the y-axis is mapped onto the 
unit circle in the >v-plane. 

13. Find the complex and real potentials in the upper 
half-plane with boundary values 0 if a* < 4 and 10 kV 
if x > 4 on the A-axis. 

14. (Angular region) Applying a suitable conformal 
mapping, obtain from Fig. 401b the potential 4> in the 
angular region —57 r< Arg z<\tt such that <b = — 3 kV 
if Arg z = and <f> = 3 kV if Arg z = 577. 

15. At z = ± 1 in Fig. 401 a the tangents to the equipotential 
lines shown make equal angles (7 r/6). Why? 


18.3 Heat Problems 

Laplace’s equation also governs heat flow problems that are steady, that is, time-independent 
Indeed, heat conduction in a body of homogeneous material is modeled by the heat 
equation 

T t = c 2 V z T 

where the function T is temperature, T t = dT/dt. t is time, and c 2 is a positive constant 
(depending on the material of the body; see Sec. 12.5). Hence if a problem is steady, so 
that T t = 0, and two-dimensional, then the heat equation reduces to the two-dimensional 
Laplace equation 

(1) V 2 7 = 7]^ + Tyy = 0, 

so that the problem can be treated by our present methods. 
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EXAMPLE 1 


EXAMPLE 2 


T( a\ y) is called the heat potential. It is the real part of the complex heat potential 

F(z) = T(x, y) + /¥(*, y). 

The curves T(x , y) = co/wr are called isotherms (= lines of constant temperature) and 
the curves ^(a, y) = const heat flow lines, because along them, heat flows from higher 
to lower temperatures. 

It follows that all the examples considered so far (Secs. 18.1, 18.2) can now be 
reinterpreted as problems on heat flow. The electrostatic equipotential lines 3 >(a, y) = const 
now become isotherms T(x 9 y) = const, and the lines of electrical force become lines of 
heat flow, as in the following two problems. 

Temperature Between Parallel Plates 

Find the temperature between two parallel plates .v = 0 and x = d in Fig. 403 having temperatures 0 and 100°C. 
respectively. 

Solution . As in Example I of Sec. 1 8. 1 we conclude that T(x, y) = ax + b . From the boundary conditions, 
b - 0 and a = 100 Id. The answer is 

100 

T(x, y) = — -v[°CJ. 

The corresponding complex potential is F(z) = (100/</)z. Heat flows horizontally in the negative .v-direction 
along the lines y = const. ■ 

Temperature Distribution Between a Wire and a Cylinder 

Find die temperature Field around a long thin wire of radius i\ = l mm that is electrically heated to T 1 - 500°F 
and is surrounded by a circular cylinder of radius r 2 = 100 mm. which is kept at temperature T 2 = 60°F by 
cooling it with air. See Fig. 404. (The wire is at the origin of the coordinate system.) 

Solution* T depends only on r. for reasons of symmetry. Hence, as in Sec. 18.1 (Example 2). 

T(.v, y) = a In r 4- b. 

The boundary conditions are 


T x = 500 = « In 1 + b , T 2 = 60 = a In 100 + b. 

Hence b = 500 (since In 1 = 0) and a = (60 - />)/Jn 100 = -95.54. The answer is 

T(x\ y ) = 500 - 95.54 In r [°F1. 

The isotheims arc concentric circles. Heat flows from the wire radially outward to the cylinder. Sketch T as a 
function of r. Does it look physically reasonable? ■ 




Fig. 405. Example 3 
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EXAMPLE 3 


EXAMPLE 4 


Mathematically the calculations remain the same in the transition to another field of 
application. Physically, new problems may arise, with boundary conditions that would 
make no sense physically or would be of no practical interest. This is illustrated by the 
next two examples. 

A Mixed Boundary Value Problem 

Find the temperature distribution in the region in Fig. 405 (cross section of a solid quarter-cylinder), whose 
vertical portion of the boundary is at 20°C, the horizontal portion at 50°C t and the circular portion is insulated. 

Solution . The insulated portion of the boundary must be a heat flow line, since by the insulation, heat is 
prevented from crossing such a curve, hence heat must flow along the curve. Thus the isotherms must meet 
such a curve at right angles. Since T is constant along an isotherm, this means that 

c )T 

(2) — = 0 along an insulated portion of the boundary. 

on 

Here dT/dn is the normal derivative of 7, that is, the directional derivative (Sec. 9.7) in the direction normal 
(perpendicular) to the insulated boundary. Such a problem in which Tis prescribed on one portion of the boundary 
and dTidn on the other portion is called a mixed boundary value problem. 

In our case, the normal direction to the insulated circular boundary curve is the radial direction toward the 
origin. Hence (2) becomes BT/dr = 0. meaning that along this curve the solution must not depend on r. Now 
Arg z — 0 satisfies ( 1 ), as well as this condition, and is constant (0 and tt! 2) on the straight portions of the 
boundary. Hence the solution is of the form 


7\x, y) = a$ + b. 


The boundary conditions yield a • 7t/2 + b = 20 and a • 0 + b = 50. This gives 

60 y 

T(x, y) = 50 0, 0 = arctan — . 

7 T X 

The isotherms tire portions of rays 0 = const. Heat flows from the x-axis along circles r = const (dashed in 
Fig. 405) to die y-axis. M 




Fig. 406. Example 4 


Another Mixed Boundary Value Problem in Heat Conduction 

Find the temperature field in the upper half-plane when the x-axis is kept at T = 0°C for x < - 1, is insulated 
for - 1 < x < 1 , and is kept at T = 20°C for x > 1 (Fig. 406a). 

Solution . We map the half-plane in Fig. 406a onto die vertical strip in Fig. 406b. find the temperature T*{u, v) 
diere, and map it back to get the temperature 7(x, y) in the half-plane. 

The idea of using that strip is suggested by Fig. 388 in Sec. 17.4 with the roles of z — x + iy and w = m + iv 
interchanged. The figure shows that z = sin w maps our present strip onto our half-plane in Fig. 406a. Hence 
the inverse function 


w = f(z) = arcsin z 
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maps that half-plane onto the strip in the w-plane. This is the mapping function that we need according to 
Theorem I in Sec. 18.2. 

The insulated segment -1 < x < 1 on the .v-axis maps onto the segment -tt ! 2 < u < ttH on the //-axis. 
The rest of the .v-axis maps onto the two vertical boundary portions // = — tt/2 and 77 / 2, v > 0. of the strip. 
This gives the transformed boundary conditions in Fig. 406b for T* (//, u). where on the insulated horizontal 
boundary, r)T*ldn = dT*ldv = 0 because v is a coordinate normal to that segment. 

Similarly to Example I we obtain 

20 

T*(u , v) = 10 + — // 

7T 

which satisfies all the boundary conditions. This is the real part of the complex potential F*(ir) =10-4* (20/7 t)\v. 
Hence the complex potential in the z-plane is 


F(z) = F*(f(z)) = 10 + — arcsinz 

7T 

and 7T.v, y) = Re F(z) is the solution. The isotherms are it = const in the strip and the hyperbolas in the z-plane. 
perpendicular to which heat flows along the dashed ellipses from the 20°-portion to the cooler 0°-portion of the 
boundary, a physically very reasonable result. ■ 

This section and the last one show the usefulness of conformal mappings and complex 
potentials. The latter will also play a role in the next section on fluid flow. 




1. CAS PROJECT. Isotherms. Graph isotherms and 
lines of heat flow in Examples 2-4. Can you see from 
the graphs where the heat flow is very rapid? 

2. Find the temperature and the complex potential in an 
infinite plate with edges y = x — 2 and y = .v 4- 2 kept 
at -10°C and 20°C. respectively. 

3. Find the temperature between two parallel plates v = 0 
and y = d kept at temperatures 0°C and 100°C, 
respectively, (i) Proceed directly, (ii) Use Example 1 
and a suitable mapping. 

4. Find the temperature T in the sector 0 ^ Arg z = irf 3, 
|z| ^ 1 if T = 20°C on the x-axis, T = 50°C on 
y = V3 a, and the curved portion is insulated. 

5. Find the temperature in Fig. 405 if T - -20°C on the 
y-axis, T — 100°C on the A-axis, and the circular 
portion of the boundary is insulated as before. 

6. Interpret Prob. 10 in Sec. 18.2 as a heat flow problem 
(with boundary temperatures, say. 20°C and 300°C). 
Along what curves does the heat flow? 

7. Find the temperature and the complex potential in the 
first quadrant of the z-plane if the y-axis is kept at 
100°C, the segment 0 < a- < 1 of the A-axis is insulated 
and the portion a* > I of the A*-axis is kept at 200°C. 
Hint. Use Example 4. 

8. TEAM PROJECT. Piecewise Constant Boundary 
Temperatures, (a) A basic building block is shown 
in Fig. 407. Find the corresponding temperature and 
complex potential in the upper half-plane. 

(b) Conformal mapping. What temperature in the 
first quadrant of the z-plane is obtained from (a) by the 


mapping w = a + z 1 2 3 4 5 6 7 8 and what are the transformed 
boundary conditions? 

(c) Superposition. Find the temperature T* and the 
complex potential F* in the upper half-plane satisfying 
the boundary condition in Fig. 408. 

(d) Semi-infinite strip. Applying vr = cosh z to (c), 
obtain the solution of the boundary value problem in 
Fig. 409. 


T* = T 2 

Fig. 407. 


— o 

a _ t u 

Team Project 8(a) 


V 


-1 

1 

O 

o 


T* = 0 T* = T 0 T* - 0 u 

Fig. 408. Team Project 8(c) 



Fig. 409. Team Project 8(d) 
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1 9-14 1 TEMPERATURE DISTRIBUTIONS IN 
PLATES 

Find the temperature T(x, y) in the given thin metal plate 
whose faces are insulated and whose edges are kept at the 
indicated temperatures or are insulated as shown. 



12 . v 


r= ioo°c 


o t = 1 oo°c * 



18.4 Fluid Flow 


Laplace’s equation also plays a basic role in hydrodynamics, in steady nonviscous fluid 
flow under physical conditions discussed later in this section. In order that methods of 
complex analysis can be applied, our problems will be two-dimensional, so that the 
velocity vector V by which the motion of the fluid can be given depends only on two 
space variables x and y and the motion is the same in all planes parallel to the ry-plane. 
Then we can use for the velocity vector V a complex function 

(1) V = V x + iV 2 

giving the magnitude |V| and direction Arg V of the velocity at each point z = x 4* iy. 
Here V 1 and V 2 are the components of the velocity in the x and y directions. V is tangential 
to the path of the moving particles, called a streamline of the motion (Fig. 410). 

We show that under suitable assumptions (explained in detail following the examples), 
for a given flow there exists an analytic function 

( 2 ) F(z) = ®(x,y) + mx,y\ 

called the complex potential of the flow, such that the streamlines are given by 
^(jt, y) = const , and the velocity vector or, briefly, the velocity is given by 


(3) V =V l + iV 2 = F’{z) 



Fig. 410. Velocity 


x 
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EXAMPLE 1 


where the bar denotes the complex conjugate. W is called the stream function. The 
function $ is called the velocity potential. The curves <£(*, y) = const are called 
equipotential lines. The velocity vector V is the gradient of 4>; by definition, this means 
that 


(4) 




Vo = 


a<E> 

ay 


Indeed, for F = d> + i'i', Eq. (4) in Sec. 13.4 is F' = 4> x + /'P* with ^ = — $> y by the 
second Cauchy-Riemann equation. Together we obtain (3): 


F' (z) = - i% = <I>* + i% = V 1 + iV 2 = V. 

Furthermore, since F(z) is analytic, <3> and ^ satisfy Laplace’s equation 


„ d 2 <3> 

(5) v 2 0> = — + — = 0, 


dx z 


dy‘ 


o T d 2 V d 2 * n 
V 2 ¥ = — 5- + ~zr~o~ = 0. 


dx 




Whereas in electrostatics the boundaries (conducting plates) are equipotential lines, in 
fluid flow the boundaries across which fluid cannot flow must be streamlines. Hence in 
fluid flow the stream function is of particular importance. 

Before discussing the conditions for the validity of the statements involving (2)-(5), let 
us consider two flows of practical interest, so that we first see what is going on from a 
practical point of view. Further flows are included in the problem set. 


Flow Around a Corner 

The complex potential F(z) = z 2 = x 2 — y 2 + 2/\vv models a flow with 


Equipotential lines 

= .v 2 - y 2 = const 

(Hyperbolas) 

Streamlines 

'P = 2 xy = const 

(Hyperbolas). 

From (3) we obtain the velocity vector 



V=2l = 2(.v - iy\ 

that is, ^ = 2a\ 

^ 2 = 

The speed (magnitude of the velocity) is 




\V\ = VVj 2 + V 2 2 = 2V.V 2 + y 2 


The flow may be interpreted as the flow in a channel bounded by the positive coordinates axes and a hyperbola, 
say, xy = 1 (Fig. 411). We note that the speed along a streamline 5 has a minimum at the point P where the 
cross section of the channel is large. ■ 


y 

s 



Fig. 411. Flow around a corner (Example 1) 
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EXAMPLE 2 Flow Around a Cylinder 

Consider the complex potential 


F(z) = $(a\ v) + Z'l'(.v. v) = z + 


1 


Using the polar form z = re u \ we obtain 

F{z) — re l ° + — e~ tB = + -j j cos 0 + / — - j sin 0. 

Hence the streamlines are 

'Rv. y) = - y j sin 0 = ■ 


COWS/. 


In particular, 'I'Cv, y) = 0 gives r — Mr = 0 or sin 0 - 0. Hence this streamline consists of the unit circle (r = Mr 
gives r = 1) and the .v-axis (0 = 0 and 0 — tt). For large |s| the term Mz in F(z) is small in absolute value, so 
that for these z the flow is nearly uniform and parallel to the .v-axis. Hence we can interpret this as a flow around 
a long circular cylinder of unit radius that is perpendicular to the z-plane and intersects it in the unit circle |z| = I 
and whose axis corresponds to - = 0. 

The flow has two stagnation points (that is, points at which the velocity V is zero), at c = ±1. This follows 
from (3) and 


F\z) = 1 



hence z 2 — I = 0. 


(See Fig. 412.) ■ 



Fig. 412. Flow around a cylinder (Example 2) 


Assumptions and Theory Underlying (2)-(5) 


THEOREM 1 


Complex Potential of a Flow 

If the domain of flow is simply connected and the flow is irrotational and 
incompressible , then the statements involving (2)-(5) hold . In particular ; then the 
flow has a complex potential F(z ), which is an analytic function. (Explanation of 
terms below.) 


PROOF We prove this theorem, along with a discussion of basic concepts related to fluid flow. 

(a) First Assumption: Irrotational Let C be any smooth curve in the z-plane given 
by z(s) = x(s) + iy(s) 9 where s is the arc length of C. Let the real variable V t be the 
component of the velocity V tangent to C (Fig. 413). Then the value of the real line integral 

f Vt ds 
J c 


( 6 ) 
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X 

Fig. 413. Tangential component of the 
velocity with respect to a curve C 


taken along C in the sense of increasing s is called the circulation of the fluid along C. 
a name that will be motivated as we proceed in this proof. Dividing the circulation by the 
length of C, we obtain the mean velocity 1 of the flow along the curve C. Now 

V t = |V| cos a (Fig. 413). 

Hence V t is the dot product (Sec. 9.2) of V and the tangent vector dzlds of C (Sec. 17.1); 
thus in (6). 

( dx dy\ 

V t dS = ( V ' + V2 ^ ) ds = * + V 2 dy- 

The circulation (6) along C now becomes 

(7) [v t ds= f (V, dx + V 2 dy ). 

J c J c 

As the next idea, let C be a closed curve satisfying the assumption as in Green’s theorem 
(Sec. 10.4), and let Cbe the boundary of a simply connected domain D. Suppose further 
that V has continuous partial derivatives in a domain containing D and C. Then we can 
use Green’s theorem to represent the circulation around C by a double integral, 

( 8 ) ^V 1 dx+Vzdy) = jj^-^dx<l,. 

The integrand of this double integral is called the vorticity of the flow. The vorticity 
divided by 2 is called the rotation 


( 9 ) 


o>(*, y) = 


1 

2 \ dx 


ciM 

3y / 


i r 

1 Definitions: -r~Z — I /C*) ^.v = mean value of / on the interval a Sx £ b, 
o a J ^ 


l f 

L J n 


f(s) ds = mean value of / on C (L = length of C), 


J ff fix, y ) dx dy = mean value of / on D (A = area of D). 
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We assume the flow to be irrotational, that is, (o(x , y) = 0 throughout the flow; thus, 


( 10 ) 


BV 2 dV 1 

dx dy 


To understand the physical meaning of vorticity and rotation, take for C in (8) a circle. 
Let r be the radius of C. Then the circulation divided by the length 27ir of C is the mean 
velocity of the fluid along C. Hence by dividing this by r we obtain the mean angular 
velocity <u 0 of the fluid about the center of the circle: 


o> 0 = 



dV 1 \ 1 rr 

—— I dx dy = — 2 a)(x . v) dx dy. 

dy ) ’ -nr 


If we now let r — » 0, the limit of co 0 is the value of (o at the center of C. Hence a>(; c, y) 
is the limiting angular velocity of a circular element of the fluid as the circle shrinks to 
the point (jt, >*)• Roughly speaking, if a spherical element of the fluid were suddenly 
solidified and the surrounding fluid simultaneously annihilated, the element would rotate 
with the angular velocity co. 

(b) Second Assumption: Incompressible . Our second assumption is that the fluid is 
incompressible. (Fluids include liquids, which are incompressible, and gases, such as air, 
which are compressible.) Then 


(ID 



dV 2 

dy 


= 0 


in every region that is free of sources or sinks, that is, points at which fluid is produced 
or disappears, respectively. The expression in (11) is called the divergence of V and is 
denoted by div V. (See also (7) in Sec. 9.8.) 

(c) Complex Velocity Potential If the domain D of the flow is simply connected 
(Sec. 14.2) and the flow is irrotational, then (10) implies that the line integral (7) is 
independent of path in D (by Theorem 3 in Sec. 10.2, where F 1 — Vi, F 2 — V* F 3 — 0, 
and z is the third coordinate in space and has nothing to do with our present z). Hence if 
we integrate from a fixed point ( a , b) in D to a variable point ( x , y) in D, the integral 
becomes a function of the point (a*, y), say, <f>(x, y): 

Jx, y ) 

(12) <D(a, y) = (V x dx + V 2 dy). 

J (a, b ) 


We claim that the flow has a velocity potential 3>, which is given by (12). To prove this, 
all we have to do is to show that (4) holds. Now since the integral (7) is independent of 
path, V l dx + V 2 dy is exact (Sec. 10.2), namely, the differential of <I>, that is, 

d3> 

V ± dx + V 2 dy = — - dx + — — dy. 

dx dy 

From this we see that V x = d<S>/dx and V 2 = d<3?/dy, which gives (4). 

That <l> is harmonic follows at once by substituting (4) into (11), which gives the first 
Laplace equation in (5). 
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We finally take a harmonic conjugate of $. Then the other equation in (5) holds. 
Also, since the second partial derivatives of and are continuous, we see that the 
complex function 

F(z) = ®(r, y) + y) 

is analytic in D . Since the curves \P(a\ y) = const are perpendicular to the equipotential 
curves <t>(x, y) = const (except where F\z) = 0), we conclude that x P(a\ y) = cootf are 
the streamlines. Hence 'f is the stream function and F(z) is the complex potential of the 
flow. This completes the proof of Theorem 1 as well as our discussion of the important 
role of complex analysis in compressible fluid flow. ■ 


! PROBLEM SET 18.4 


1 1-15 1 FLOW PATTERNS: STREAMLINES, 
COMPLEX POTENTIAL 

These problems should encourage you to experiment with 
various functions F(z), many of which model interesting 
flow patterns. 

1. (Parallel flow) Show that F(z) = —iKz (K positive 
real) describes a uniform flow upward, which can be 
interpreted as a uniform flow between two parallel lines 
(parallel planes in three-dimensional space). See 
Fig. 414. Find the velocity vector, the streamlines, and 
the equipotential lines. 

y\ 



Fig. 414. Parallel flow in Problem 1 

2. (Conformal mapping) Obtain the flow in Example 1 
from that in Prob. 1 by a suitable conformal mapping. 

3. Find the complex potential of a uniform flow parallel 
to the .v-axis in the positive ^-direction. 

4. What happens to the flow in Prob. 1 if you replace z 
by ze~ ttt with constant a, e.g., a = 7t/4? 

5. What is the complex potential of an upward parallel 
flow in the direction of y = 2.v? 

6. (Extension of Example 1) Sketch or graph the flow in 
Example I on the whole upper half-plane. Show that 
you can interpret it as as flow against a horizontal wall 
(the .v-axis). 


7. What F(z) would be suitable in Example 1 if the angle 
of the comer were tt/3? 

8. Sketch or graph the streamlines and equipotential lines 
of F(z) = iz 3 4 5 6 . Find V. Find all points at which V is 
parallel to the A-axis. 

9. Find and graph the streamlines of F(z) = z 2 4- 2 z. 
Interpret the flow. 

10. Show that F(z) = iz 2 models a flow around a corner. 
Sketch the streamlines and equipotential lines. Find V. 

11. (Potential F(z) = 1/z) Show that the streamlines of 
F(z) = 1/z are circles through the origin. 

12. (Cylinder) What happens in Example 2 if you replace 
z by z 2 ? Sketch and interpret the resulting flow in the 
first quadrant. 

13. Change F(z) in Example 2 slightly to obtain a flow 
around a cylinder of radius r 0 that gives the flow in 
Example 2 if r 0 — > 1 . 

14. (Aperture) Show that F(z) = arccosh z gives confocal 
hyperbolas as streamlines, with foci at z = ± 1, and the 
flow may be interpreted as a flow through an aperture 
(Fig. 415). 

15. (Elliptical cylinder) Show that F(z) = arccos z gives 
confocal ellipses as streamlines, with foci at z = ± 1 , 
and that the flow circulates around an elliptic cylinder 
or a plate (the segment from —1 to 1 in Fig. 416). 


Fig. 415. Flow through an aperture in Problem 14 
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Fig. 416. Flow around a plate in Problem 15 



Fig. 417. Point source 


y 


i 



Fig. 418. Vortex flow 

16. TEAM PROJECT. Role of the Natural Logarithm 
in Modeling Flows, (a) Basic flows: Source and sink. 
Show that F{z) = (c/2'77) In z with constant positive 
real c gives a flow directed radially outward (Fig. 417), 
so that F models a point source at z = 0 (that is, a 
source line „v = 0, y = 0 in space) at which fluid is 
produced, c is called the strength or discharge of the 
source. If c* is negative real, show that the flow is 
directed radially inward, so that F models a sink at 
z = 0. a point at which fluid disappears. Note that 
- = 0 is the singular point of F(z)< 

(b) Basic flows: Vortex. Show that F(z) = -{KUItt) 
In z with positive real K gives a flow circulating 
counterclockwise around z = 0 (Fig. 418). z = 0 is 
called a vortex. Note that each time we travel around 
the vortex, the potential increases by K . 

(c) Addition of flows. Show that addition of the 
velocity vectors of two flows gives a flow whose 
complex potential is obtained by adding the complex 
potentials of those flows. 


(d) Source and sink combined. Find the complex 
potentials of a flow with a source of strength I at 
z = —ci and of a flow with a sink of strength 1 at 
z = Add both and sketch or graph the streamlines. 
Show that for small |a| these lines look similar to those 
in Prob. 1 1 . 

(e) Flow with circulation around a cylinder. Add the 
potential in (b) to that in Example 2, Show that this gives 
a flow for which the cylinder wall |r.| = 1 is a streamline. 
Find the speed and show that the stagnation points are 



if K — 0 they are at ± 1; as K increases they move up 
on the unit circle until they unite at z = / (K = 47t, see 
Fig. 419), and if K > Air they lie on the imaginary axis 
(one lies in the field of flow and the other one l ies inside 
the cylinder and has no physical meaning). 



Fig. 419. Flow around a cylinder without 
circulation ( K = 0) and with circulation 
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18 .! Poisson’s Integral Formula for Potentials 

So far in this chapter we have seen that complex analysis offers powerful methods for 
modeling and solving two-dimensional potential problems based on conformal mappings 
and complex potentials. A further method results from complex integration. As a most 
important result it yields Poisson’s integral formula (5) for potentials in a standard domain 
(a circular disk) and from (5) a useful series (7) for these potentials. Hence we can solve 
problems for disks and then map solutions conformally onto other domains. 

Poisson’s formula will follow from Cauchy’s integral formula (Sec. 14.3) 


(I) 


1 r F(z *) 


dz*. 


Here C is the circle z* = Re ta (counterclockwise, 0 ^ a ^ 2tt), and we assume that F(z*) 
is analytic in a domain containing C and its full interior. Since dz* = iRe la da = iz* da, 
we obtain from (1) 


( 2 ) 



(z* = Re m , z — re lH ). 


Now comes a little trick. If instead of z inside C we take a Z outside C, the integrals (1) 
and (2) are zero by Cauchy’s integral theorem (Sec. 14.2). We choose Z = z*z*/z = R 2 /z> 
which is outside C because |Z| = l^/\z\ = R 2 /r > R. From (2) we thus have 



and by straightforward simplification of the last expression on the right. 



We subtract this from (2) and use the following formula that you can verify by direct 
calculation (zz* cancels): 


( 3 ) 

We then have 

(4) 


r 4 “ z z - z* 




z*;* - zz 
(Z* - z)(z* - z) 


rv x 1 f~" r*/ Z*Z* ~ ZZ 

F(Z) ~ 2^1 F(Z ) da ■ 


From the polar representations of z and z* we see that the quotient in the integrand is real 
and equal to 

R 2 - r 2 _ R 2 - ,• 2 

(Re ia - re iu )(Re- ia - re~ ie ) ~ R 2 - 2Rr cos (6 - a) + r 2 ' 
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We now write F(z) = 4>(r, 0) + /^(r, 0) and take the real part on both sides of (4). 
Then we obtain Poisson’s integral formula 2 * * 


(5) 


*> - ^ a) 


fl 2 - r 2 

2Rr cos (0 — a) + r 2 


da. 


This formula represents the harmonic function $ in the disk |z| ^ i? in terms of its values 
0>(R y a) on the boundary (the circle) \z\ = R. 

Formula (5) is still valid if the boundary function <&(/?, a) is merely piecewise continuous 
(as is practically often the case; see Fig. 401 in Sec. 18.2 for an example). Then (5) gives 
a function harmonic in the open disk, and on the circle \z\ = R equal to the given boundary 
function, except at points where the latter is discontinuous. A proof can be found in 
Ref. [Dl] in App. 1. 


Series for Potentials in Disks 

From (5) we may obtain an important series development of in terms of simple harmonic 
functions. We remember that the quotient in the integrand of (5) was derived from (3). 
We claim that the right side of (3) is the real part of 

z* + z _ (z* + z)(z* - z) = z*z* - zz - z*z + zz* 
z* — z (z* - z)(z* - z) |z* - z| 2 


Indeed, the last denominator is real and so is z*z* - zz in die numerator, whereas 
— z*Z + zz* — 2/ 1m (zz*) in the numerator is pure imaginary. This verifies our claim. 
Now by the use of the geometric series we obtain (develop the denominator) 


( 6 ) 


z* + z 

r* — r 


1 + (z/z*) 
1 - (z/z*) 




Since - = re 10 and z* — Re™, we have 


Re 




cos (nd — na). 


On the right, cos (nd — na) — cos n8 cos na + sin nd sin na. Hence from (6) we obtain 


( 6 *) 


Re 


Z* + ; 

z* — z 


■••s-w 

CC / r \n 

1 4- 2 2 ( “ j (cos nd cos na + sin nd sin na). 


2 SJM£0N DENIS POISSON { 1 78 1-1840), French mathematician and physicist, professor in Paris from 1 809. 

His work includes potential theory, partial differential equations (Poisson equation. Sec. 12.1), and probability 

(Sec. 24.7). 
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EXAMPLE 1 


This expression is equal to the quotient in (5), as we have mentioned before, and by 
inserting it into (5) and integrating term by term with respect to a from 0 to 2tt we obtain 


(7) 


<D(r, 0) = a 0 + 


j (ii- 


cos n6 + b n sin nd) 


where the coefficients are [the 2 in (6*) cancels the 2 in I/(27 t) in (5)] 

r&TT 1 JZ IT 

27T *^o ' 7T j o 


1 r 1 r 

o 0 = — a) da, a n = — $(/?, a) cos na da, 

27 r *'o 7 t J 0 

(8) 2 n = l, 2, 

1 r 

b n = — I a) sin na da, 

7T •'o 


the Fourier coefficients of a); see Sec. 1 LI. Now for r = R the series (7) becomes 
the Fourier series of <$>(/?, a). Hence the representation (7) will be valid whenever the 
given <!>(/?, a) on the boundary can be represented by a Fourier series. 


Dirichlet Problem for the Unit Disk 

Find the electrostatic potential 0(r, 0) in the unit disk r < I having the boundary values 

f — CX.l'TT if — 7T<a<0 

aht if 0 < a < it 


<h(l. a) = | 


Hence, a n - -4 /(/Tar) if n is odd, a n = 0 if n = 2, 4, • • • , and the potential is 
d>(r. 

Figure 421 shows the unit disk and some of the equipotential lines (curves O = const). 


1 4 T r 3 r 5 I 

[r, 0) = — 2 I r cos 0 + — cos 30 + — cos 50 + • • • . 

2 7r L 3 5 J 


(Fig. 420). 


Solution . Since 0(1, a) is even, b n = 0, and from (8) we obtain a 0 = \ and 

f° a fa 1 2 

— | — cos na da + — cos /ior = « « (cos «7T — 1). 

7T J Q 7T J 


0(1, a) 


-IT 0 k a 

Fig. 420. Boundary values in Example 1 



Fig. 421. Potential in Example 1 
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1. Verify (3). 

2. Show that every term in (7) is a harmonic function in 
the disk r < R. 

3. Give the details of the derivation of the series (7) from 
the Poisson formula (5). 

|4-I3| HARMONIC FUNCTIONS IN A DISK 

Using (7), find the potential <D(/\ 0) in the unit disk r < 1 
having the given boundary values 4>(l t 0). Using the sum 
of the first few terms of the series, compute some values 
of <l> and sketch a figure of the equipotential lines. 

4. 3>(1, 0) = sin 20 

5. cl>(l, 0) = 2 sin 2 0 

6. <I>(I, 6) = cos 2 50 

7. <£(1, 0) = 0 if —77 < 0 < 77 

8. <b(l, 0) = 0ifO < 0< 27 r 

9. <I>(I, 0) = sin 3 20 

10. 3>(1, 0) = cos 4 0 

11. 3>(1, 0) = 0 2 if — 7T < 0 < 77 

12. <D(1, 0) = I if “^77 < 0<§77, 

<t>(f 0) = 0 if^77< 0<§7T 


13. ®(1, 0) = 0 if T <0< ^77, 

<£(1.0) = 77 - 0if^77< 0< §77 

14. TEAM PROJECT. Potential in a Disk, (a) Mean 
value property. Show that the value of a harmonic 
function at the center of a circle C equals the mean 
of the value of <t> on C (see Sec. 18.4, footnote 1, for 
definitions of mean values). 

(b) Separation of variables. Show that the terms of 
(7) appear as solutions in separating the Laplace 
equation in polar coordinates. 

(c) Harmonic conjugate. Find a series for a harmonic 
conjugate of <I> from (7). 

(d) Power series. Find a series for F(z) = <3> + /'P\ 

15. CAS EXPERIMENT. Series (7). Write a program for 
series developments (7). Experiment on accuracy by 
computing values from partial sums and comparing 
them with values that you obtain from your CAS graph. 
Do this (a) for Example 1 and Fig. 421, (b) for O in 
Prob. 8 (which is discontinuous on the boundary!), 
(c) for a <f> of your choice with continuous boundary 
values, (d) for with discontinuous boundary values. 


18.6 General Properties of Harmonic Functions 

General properties of harmonic functions can often be obtained from properties of analytic 
functions in a simple fashion. Specifically, important mean value properties of harmonic 
functions follow readily from those of analytic functions. The details are as follows. 


THEOREM 1 


Mean Value Property of Analytic Functions 

Let f(z ) be analytic in a simply connected domain D. Then the value of F(z ) at a 
point Zo hi O is equal to the mean value of F(z) on any circle in D with, center at z 0 . 


PROOF In Cauchy’s integral formula (Sec. 14.3) 


( 1 ) 


1 f F(: 

FUo) = f — 

277/ J C Z - Z 0 


F(Z) 


dz 


we choose for C the circle z — Zo + re ia in D. Then z — Zo = re ta , dz — ire icc do r, and 
(1) becomes 
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The right side is the mean value of F on the circle (= value of the integral divided by the 
length 277 of the interval of integration). This proves the theorem. ■ 

For harmonic functions. Theorem 1 implies 


THEOREM 2 


Two Mean Value Properties of Harmonic Functions 

Let <D(jc, y) be harmonic in a simply connected domain D. Then the value of 
<I>(a\ y) at a point (a 0 , y 0 ) in D is equal to the mean value of 3>(a, > t ) on any circle 
in D with center at (a* 0 , y 0 ). This value is also equal to the mean value of <b(A\ >•) 
on any circular disk in D with center (a* 0 , y 0 ). [See footnote 1 in Sec. 18.4.] 


PROOF The first part of the theorem follows from (2) by taking the real parts on both sides, 

1 r 2 " 

®C*b. y 0 ) = Re F(x o 4- o’o) = — I <t>(*o + r cos a, )> 0 + r sm a) da. 

Z77 


The second part of the theorem follows by integrating this formula over r from 0 to r 0 
(the radius of the disk) and dividing by r 0 2 / 2, 


(3) 


*(a*o, y 0 ) = 


1 rV" 

2 I I + r cos a, >- 0 + 

Wq J o J o 


r sin a)rda dr. 


The right side is the indicated mean value (integral divided by the area of the region of 
integration). ■ 


Returning to analytic functions, we state and prove another famous consequence of 
Cauchy’s integral formula. The proof is indirect and shows quite a nice idea of applying 
the A/L-inequality. (A bounded region is a region that lies entirely in some circle about 
the origin.) 


THEOREM 3 


Maximum Modulus Theorem for Analytic Functions 

Let F(z) be analytic and nonconstant in a domain containing a bounded region R 
and its boundary. Then the absolute value |F(z)| cannot have a maximum at an 
interior point ofR. Consequently , the maximum of\F(z)\ is taken on the boundary 
of R. If F(z) ^ 0 in R, the same is true with respect to the minimum of\F(z)\- 


PROOF We assume that |F(z)| has a maximum at an interior point zo of R and show that this leads 
to a contradiction. Let |F(zo)| — M be this maximum. Since F(z) is not constant, |F(z)| is 
not constant, as follows from Example 3 in Sec. 13.4. Consequently, we can find a circle 
C of radius r with center at zq such that the interior of C is in R and |F(z)| is smaller than 
M at some point P of C. Since |F(z)| is continuous, it will be smaller than M on an arc 
C x of C that contains P (see Fig. 422), say. 


\F(z)\ £M-k (k> 0) 


for all z on C x . 
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THEOREM 4 


PROOF 



Fig. 422. Proof of Theorem 3 


Let C x have the length L x . Then the complementary arc C 2 of C has the length 2 irr — L x . 
We now apply the A/L-inequality (Sec. 14.1) to (1) and note that \z — Zo I = r. We then 
obtain (using straightforward calculation in the second line of the formula) 



* + 2i 


f -£2.* 

J c 2 z “ Zo 


(?) 


(2rrr - L x ) = M - 


kL x 

2irr 


< M 


that is, M < M, which is impossible. Hence our assumption is false and the first statement 
is proved. 

Next we prove the second statement. If F(z) ^ 0 in R, then 1 lF{z) is analytic in R. 
From the statement already proved it follows that the maximum of l/|F(z)| lies on the 
boundary of R. But this maximum corresponds to the minimum of |F(z)|. This completes 
the proof. ■ 


This theorem has several fundamental consequences for harmonic functions, as follows. 


Harmonic Functions 

Let c D( jc, y) be harmonic in a domain containing a simply connected bounded region 
R and its boundaiy curve C. Then: 

(I) (Maximum principle) If <h(x, y) is not constant ; it has neither a maximum 
nor a minimum in R. Consequently , the maximum and the minimum are taken on 
the boundary of R. 

(II) If d>(*, y) is constant on C, then <J>(x, y) is a constant. 

(III) If h(x, y) is harmonic in R and on C and if h(x, y) = <h(x, y) on C, then 
h(x, y) = 0(x, y) everywhere in R. 


(I) Let ^(Xj y) be a conjugate harmonic function of $(x, y) in R. Then the complex 
function F(z ) = ^(a', y) + /^(a’, y) is analytic in R, and so is G(z) = e nz> . Its absolute 
value is 

)G(z)\ = e ReFC *> = €****>. 

From Theorem 3 it follows that \G(z)\ cannot have a maximum at an interior point oiR. 
Since e 0 is a monotone increasing function of the real variable <5, the statement about the 
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maximum of 4> follows. From this, the statement about the minimum follows by replacing 
O by -4>. 

(II) By (1) the function y) takes its maximum and its minimum on C. Thus, if 
<E>(a\ y) is constant on C, its minimum must equal its maximum, so that 4 >(jc, y) must be 
a constant. 

(IH) If h and <I> are harmonic in R and on C, then h — is also harmonic in R and 
on C, and by assumption, h — $ = 0 everywhere on C. By (II) we thus have h - = 0 

everywhere in R> and (HI) is proved. ■ 

The last statement of Theorem 4 is very important. It means that a harmonic function is 
uniquely determined in R by its values on the boundary of R. Usually, y) is required 
to be harmonic in R and continuous on the boundary of R , that is, 

lim <$>(*:, y) = <t>(A 0 , y 0 ), where (a 0 , y 0 ) is on the boundary and (a, y) is in R. 

x-+x 0 

3/— 2/o 


Under these assumptions the maximum principle (I) is still applicable. The problem of 
determining $(a\ y) when the boundary values are given is called the Dirichlet problem 
for the Laplace equation in two variables, as we know. From (III) we thus have, as a 
highlight of our discussion. 


THEOREM 5 


Uniqueness Theorem for the Dirichlet Problem 

If for a gi ven region and given boundary values the Dirichlet problem for the Laplace 
equation in two variables has a solution , the solution is unique. 




1. Integrate \z\ 2 around the unit circle. Does your result 
contradict Theorem J ? 

1^-4 1 VERIFY THEOREM 1 for the given F(z ), co> and 
circle of radius l. 

2. (z + l) 3 , = 2 

3. (- - 2) 2 , zo = 1 

4. 10- 4 , £q = 0 

VERIFY THEOREM 2 for the given y), 

(a 0 , y 0 ) and circle of radius 1 . 

5. (.v - 2 )(y - 2). (4, -4) 

6. .v 2 - y 2 (3. 8) 

7. .v 3 — 3 a*v 2 (1,1) 

8. Derive Theorem 2 from Poisson’s integral formula. 

9. CAS EXPERIMENT. Graphing Potentials. Graph 
the potentials in Probs. 5 and 7 and for three other 


functions of your choice as surfaces over a rectangle 
or a disk in the A*y-plane. Find the locations of maxima 
and minima by inspecting these graphs. 

10. TEAM PROJECT. Maximum Modulus of Analytic 
Functions, (a) Verify Theorem 3 for (i) F(z) = z 2 and 
the square 4 ^ a* ^ 6, 2 ^ v = 4, (ii) F(z) = e 3z and 
any bounded domain, (iii) F(z) = sin z and the unit 
disk. 

(b) F( x ) = cos a* (a real) has a maximum 1 at 0. 
How does it follow that this cannot be a maximum of 
|F(s)| = |cos z\ in a domain containing z = 0? 

(c) F(z) = 1 + |z| 2 is not zero in the disk |z| ^ 4 and 
has a minimum at an interior point. Does tills contradict 
Theorem 3? 

(d) If F(z) is analytic and not constant in the closed 
unit disk D: | ' | ^ 1 and |F(z)| = c = const on the unit 
circle, show that F(z) must have a zero in D. Can you 
extend this to an arbitrary simple closed curve? 
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MAXIMUM MODULUS 

Find the location and size of the maximum of \F(z)\ in the 
unit disk \z\ = I. 

11. F(z) = z 2 - 1 

12. F(z) =■ az + b («, b complex) 

13. F(z) — cos 2 z 

14. Verify the maximum principle for 3>(.\\ y) = e x cos y 
and the rectangle a ^ x ^ b* 0 ^ y ^ 2 tt. 


15. (Conjugate) Do and a harmonic conjugate of in 
a region R have their maximum at the same point of /?? 

16. (Conformal mapping) Find the location (« x , y x ) of the 
maximum of <f>* = e u cos y in /?*: |w| ^ 1, v ^ 0, 
where vt* = it + tv. Find the region R that is mapped 
onto R* by w = f(z) — z 2 . Find the potential in R 
resulting from <I>* and the location (jr,. Vi) of the 
maximum. Is (« lf Uj) the image of (.v lt y x )? If so, is 
this just by chance? 


SJEr-tR^t8^REV3iE^WE^UE S T I O N S AND PROBLEMS 


1. Why can potential problems be modeled and solved by 
complex analysis? For what dimensions? 

2. What is a harmonic function ? A harmonic conjugate? 

3. Give a few examples of potential problems considered 
in this chapter. 

4. What is a complex potential? What does it give 
physically? 

5. How can conformal mapping be used in connection with 
the Dirichlet problem? 

6. What heat problems reduce to potential problems? Give 
a few examples. 

7. Write a short essay on potential theory in fluid flow 
from memory. 

8. What is a mixed boundary value problem? Where did 
it occur? 

9. State Poisson’s formula and its derivation from 
Cauchy’s formula. 

10. State the maximum modulus theorem and mean value 
theorems for harmonic functions. 

11. Find die potential and complex potential between the 
plates y = a and v = a* + 10 kept at 10 V and 110 V, 
respectively. 

12. Find the potential between the cylinders \z\ = 1 cm 
having potential 0 and \z\ = 10 cm having potential 20 
kV. 

13. Find the complex potential in Prob. 12. 

14. Find the equipotential line if = 0 V between the 
cylinders |z| = 0.25 cm and \z\ — 4 cm kept at -220 V 
and 220 V. respectively. (Guess first.) 

15. Find the potential between the cylinders \z\ = 10 cm 
and |s| = 100 cm kept at the potentials 10 kV and 0, 
respectively. 

16. Find the potential in the angular region between the 
plates Arg z — tt/ 6, kept at 8 kV, and Arg z = tt/3, kept 
at 6 kV. 


17. Find the equipotential lines of F(z) = / Ln z . 

18. Find and sketch the equipotential lines of 
F(z) = (1 + iVz. 

19. What is the complex potential in the upper half-plane 
if the negative half of the .v-axis has potential 1 kV and 
the positive half is grounded? 

20. Find the potential on the ray y = a. a > 0, and on 
the positive half of the A-axis if the positive half of 
the y-axis is at 1200 V and the negative half is 
grounded. 

21. Interpret Prob. 20 as a problem in heat conduction. 

22. Find the temperature in the upper half-plane if the 
portion x > 2 of the A-axis is kept at 50°C and the other 
portion at 0°C. 

23. Show that the isotherms of F(z) = ~~iz 2 + z are 
hyperbolas. 

24. If the region between two concentric cylinders of radii 
2 cm and 10 cm contains water and the outer cylinder 
is kept at 20°C, to what temperature must we heat the 
inner cylinder in order to have 30°C at distance 5 cm 
from the axis? 

25. What are the streamlines of F{z) = /A;? 

26. What is the complex potential of a flow around a 
cylinder of radius 4 without circulation? 

27. Find the complex potential of a source at z = 5. What 
are the streamlines? 

28. Find the temperature in the unit disk |z| ^ I in the form 
of an infinite series if the left semicircle of |z| = 1 has 
the temperature of 50°C and the right semicircle has the 
temperature 0°C. 

29. Same task as in Prob. 28 if the upper semicircle is at 
40°C and the lower at 0°C. 

30. Find a series for the potential in the unit disk with 
boundary values <3>(l, 0) = 0 2 (- 7 r < 6 < 77 ). 
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Complex Analysis and Potential Theory 


Potential theory is the theory of solutions of Laplace’s equation 

(1) V 2 <t> = 0. 

Solutions whose second partial derivatives are continuous are called harmonic 
functions. Equation ( 1 ) is the most important PDE in physics, where it is of interest 
in two and three dimensions. It appears in electrostatics (Sec. 18.1), steady-state heat 
problems (Sec. 18.3), fluid flow (Sec. 18.4), gravity, etc. Whereas the three-dimensional 
case requires other methods (see Chap. 12), two-dimensional potential theory can 
be handled by complex analysis, since the real and imaginary parts of an analytic 
function are harmonic (Sec. 1 3.4). They remain harmonic under conformal mapping 
(Sec. 18.2), so that conformal mapping becomes a powerful tool in solving 
boundary value problems for (1), as is illustrated in this chapter. With a real potential 
<l> in (1) we can associate a complex potential 

(2) F(z) = 4> + /¥ (Sec. 18.1). 

Then both families of curves <t> = const and = const have a physical meaning. 
In electrostatics, they are equipotential lines and lines of electrical force (Sec. 18.1). 
In heat problems, they are isotherms (curves of constant temperature) and lines of 
heat flow (Sec. 18.3). In fluid flow, they are equipotential lines of the velocity 
potential and streamlines (Sec. 18.4). 

For the disk, the solution of the Dirichlet problem is given by the Poisson formula 
(Sec. 18.5) or by a series that on the boundary circle becomes the Fourier series of 
the given boundary values (Sec. 18.5). 

Harmonic functions, like analytic functions, have a number of general properties; 
particularly important are the mean value property and the maximum modulus 
property (Sec. 18.6), which implies the uniqueness of the solution of the Dirichlet 
problem (Theorem 5 in Sec. 18.6). 




PART 


Numeric 
Analysis 

Software (p. 778-779) 

CHAPTER 19 Numerics in General 
CHAPTER 20 Numeric Linear Algebra 
CHAPTER 21 Numerics for ODEs and PDEs 

Numeric analysis, more briefly also called numerics, concerns numeric methods, that 
is, methods for solving problems in terms of numbers or corresponding graphical 
representations. It also includes the investigation of the range of applicability and of the 
accuracy and stability of these methods. 

Typical tasks for numerics are the evaluation of definite integrals, the solution of equations 
and linear systems, the solution of differential or integral equations for which there are 
no solution formulas, and the evaluation of experimental data for which we want to obtain, 
for example, an approximating polynomial. 

Numeric methods then provide the transition from the mathematical model to an 
algorithm, which is a detailed stepwise recipe for solving a problem of the indicated kind 
to be programmed on your computer, using your CAS (computer algebra system) or other 
software, or on your programmable calculator. 

In this and the next two chapters we explain and illustrate the most frequently used basic 
numeric methods in algorithmic form. Chapter 19 concerns numerics in general; Chap. 20 
numeric linear algebra, in particular, methods for linear systems and matrix eigenvalue 
problems; and Chap. 21 numerics for ODEs and PDEs. 

The algorithms are given in a form that seems best for showing how a method works. We 
suggest that you also make use of programs from public-domain or commercial software 
listed on pp. 778-779 or obtainable on the Internet. 




Ill 
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Numerics has increased in importance to the engineer more than any other field of 
mathematics owing to the ongoing development of powerful software resulting from great 
research activity in numerics: new methods are invented, existing methods are improved 
and adapted, and old methods — impractical in precomputer times — are rediscovered. A 
main goal in these activities is the development of well-structured software. And in large- 
scale work — millions of equations or steps of iteration — even small algorithmic 
improvements may have a large effect on computing time, storage demand, accuracy, and 
stability. 

On average this makes the algorithms used in practice more and more complicated. 
However , the more sophisticated modern software will become , the more important it 
will be to understand concepts and algorithms in a basic form that shows original 
motivating ideas of recent developments. 

To avoid misunderstandings: Various simple classical methods are still very useful in 
many routine situations and produce satisfactory results. In other words, not everything 
has become more sophisticated. 

Software 

See also http://www.wiley.com/college/kreyszig/ 

The following list will help you if you wish to find software. You may also obtain 
information on known and new software from magazines, such as Byte Magazine or PC 
Magazine , from articles published by the American Mathematical Society (see also their 
website at www.ams.org), the Society for Industrial and Applied Mathematics (SIAM, at 
www.siam.org), the Association for Computing Machinery (ACM, at www.acm.org), or 
the Institute of Electrical and Electronics Engineers (IEEE, at www.ieee.org). Consult 
also your library, Computer Science Department, or Mathematics Department. 

Derive. Texas Instruments, Inc., Dallas, TX. Phone 1-800-842-2737 or (972) 917-8324, 
website at www.derive.com or www.education.ti.com. 

EISPACK. See LAPACK. 

GAMS (Guide to Available Mathematical Software). Website at http://gams.nist.gov. 
On-line cross-index of software development by NIST, with links to IMSL, NAG, and 
NETLIB. 

IMSL (International Mathematical and Statistical Library). Visual Numerics, Inc., 
Houston, TX. Phone 1-800-222-4675 or (713) 784-3131, website at www.vni.com. 
Mathematical and statistical Fortran routines with graphics. 

LAPACK. Fortran 77 routines for linear algebra. This software package supersedes 
LINPACK and EISPACK. You can download the routines 

(see http://cm.bell-labs.com/netlib/bib/mirrors.html) or order them directly from NAG. 
The LAPACK User’s Guide is available at www.netlib.org. 
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UNPACK see LAPACK 

Maple. Waterloo Maple, Inc., Waterloo, ON, Canada. Phone 1-800-267-6583 or 
(519) 747-2373, website at www.maplesoft.com. 

Maple Computer Guide. For Advanced Engineering Mathematics, 9th edition. By 
E. Kreyszig and E. J. Norminton. J. Wiley and Sons, Inc., Hoboken, NJ. Phone 
1 -800-225-5945 or (20 1 ) 748-6000. 

Mathcad. MathSoft, Inc., Cambridge, MA. Phone 1-800-628-4223 or (617) 444-8000. 
website at www.mathcad.com or www.mathsoft.com. 

Mathematica. Wolfram Research, Inc., Champaign, IL. Phone 1-800-965-3726 or 
(217) 398-0700, website at www.wolframresearch.com. 

Mathematica Computer Guide. For Advanced Engineering Mathematics, 9th 
edition. By E. Kreyszig and E. J. Norminton. J. Wiley and Sons, Inc., Hoboken, NJ. Phone 
1-800-225-5945 or (201) 748-6000. 

Matlab. The MathWorks, Inc., Natick, MA. Phone (508) 647-7000, website at 
www.mathworks.com. 

NAG. Numerical Algorithms Group, Inc., Downders Grove, IL. Phone (630) 971-2337, 
website at www.nag.com. Numeric routines in Fortran 77, Fortran 90, and C. 

NETLIB. Extensive library of public-domain software. See at www.netlib.org and 
http://cm.bell-labs.com/netlib/. 

NIST. National Institute of Standards and Technology, Gaithersburg, MD. Phone 
(301) 975-2000, website at www.nist.gov. For Mathematical and Computational Science 
Division phone (301) 975-3800. See also http://math.nist.gov. 

Numerical Recipes. Cambridge University Press, New York, NY. Phone (212) 924-3900, 
website at www.us.cambridge.org. Books (also source codes on CD ROM and 
discettes) containing numeric routines in C, C + '\ Fortran 77, and Fortran 90. To order, 
call office at West Nyack, NY, at 1-800-872-7423 or (845) 353-7500 or online at 
www.numerical-recipes.com. 


FURTHER SOFTWARE IN STATISTICS. See Part G. 




CHAPTER 1 9 
Numerics in General 


This first chapter on numerics begins with an explanation of some general concepts, such 
as floating point, roundoff errors, and general numeric errors and their propagation. In 
Sec. 19.2 we discuss methods for solving equations. Interpolation methods, including 
splines, follow in Secs. 19.3 and 19.4. The last section (19.5) concerns numeric integration 
and differentiation. 

The purpose of this chapter is twofold. First, for all these tasks the student should 
become familiar with the most basic (but not too complicated) numeric solution methods. 
These are indispensable for the engineer, because for many problems there is no solution 
formula (think of a complicated integral or a polynomial of high degree or the interpolation 
of values obtained by measurements). In other cases a complicated solution formula may 
exist but may be practically useless. 

Second, the student should learn to understand some basic ideas and concepts that are 
important throughout numerics, such as the practical form of algorithms, the estimation 
of errors, and the order of convergence. 

Prerequisite: Elementary calculus 

References and Answers to Problems: App. 1 Part E, App. 2 


19.1 Introduction 

Numeric methods are used to solve problems on computers or calculators by numeric 
calculations, resulting in a table of numbers and/or graphical representations (figures). The 
steps from a given situation (in engineering, economics, etc.) to the final answer are usually 
as follows. 

1. Modeling. We set up a mathematical model of our problem, such as an integral, a 
system of equations, or a differential equation. 

2. Choosing a numeric method and parameters (e.g., step size), perhaps with a 
preliminary error estimation. 

3. Programming. We use the algorithm to write a corresponding program in a CAS, 
such as Maple, Mathematica, Matlab, or Mathcad, or, say, in Fortran, C, or C ++ , 
selecting suitable routines from a software system as needed. 

4. Doing the computation. 

5. Interpreting the results in physical or other terms, also deciding to rerun if further 
results are needed. 
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Steps 1 and 2 are related. A slight change of the model may often admit of a more 
efficient method. To choose methods, we must first get to know them. Chapters 19—21 
contain efficient algorithms for the most important classes of problems occurring 
frequently in practice. 

In Step 3 the program consists of the given data and a sequence of instructions to be 
executed by the computer in a certain order for producing the answer in numeric or graphic 
form. 

To create a good understanding of the nature of numeric work, we continue in this 
section with some simple general remarks. 

Floating-Point Form of Numbers 

We know that in decimal notation, every real number is represented by a finite or an 
infinite sequence of decimal digits. Now most computers have two ways of representing 
numbers, called fixed point and floating point. In a fixed-point system all numbers are 
given with a fixed number of decimals after the decimal point; for example, numbers 
given with 3 decimals are 62.358, 0.014, 1 .000. In a text we would write, say, 3 decimals 
as 3D. Fixed-point representations are impractical in most scientific computations because 
of their limited range (explain!) and will not concern us. 

In a floating-point system we write, for instance, 

0.6247 • 10 3 , 0.1735 • 10" 13 , -0.2000 • 10" 1 

or sometimes also 

6.247 • 10 2 , 1.735 • lO -14 , -2.000 • 1(T 2 . 

We see that in this system the number of significant digits is kept fixed, whereas the 
decimal point is “floating.” Here, a significant digit of a number c is any given digit of 
c, except possibly for zeros to the left of the first nonzero digit; these zeros serve only to 
fix the position of the decimal point. (Thus any other zero is a significant digit of c.) For 
instance, each of the numbers 


1360, 1.360, 0.001360 

has 4 significant digits. In a text we indicate, say, 4 significant digits, by 4S. 

The use of exponents permits us to represent very large and very small numbers. Indeed, 
theoretically any nonzero number a can be written as 

(1) a = ±m • 10 n , 0.1 ^ \m\ < 1, n integer. 

On the computer, m is limited to k digits (e.g., k = 8) and n is limited, giving representations 
(for finitely many numbers only!) 

(2) a = ±m • 1 0 n , m = 0 xl x d 2 * * * d kJ d l > 0. 

These numbers d are often called k-digit decimal machine numbers. Their fractional part 
m (or m) is called the mantissa. This has nothing to do with “mantissa” as used for 
logarithms, n is called the exponent of d. 
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Underflow and Overflow. The range of exponents that a typical computer can handle 
is very large. The IEEE (Institute of Electrical and Electronics Engineers) floating-point 
standard for single precision (the usual number of digits in calculations) is about 
—38 < n < 38 (about — 125 < n* < 125 for the exponent in binary representations, 
i.e., representations in base 2). [For so-called double precision it is about —308 <n< 308 
(about -1020 < /i* < 1020 for binary).] If in a computation a number outside that range 
occurs, this is called underflow when the number is smaller and overflow when it is 
larger. In the case of underflow the result is usually set to zero and computation continues. 
Overflow causes the computer to halt. Standard codes (by IMSL, NAG, etc.) are written 
to avoid overflow. Error messages on overflow may then indicate programming errors 
(incorrect input data, etc.). 


Roundoff 

An error is caused by chopping (= discarding all decimals from some decimal on) or 
rounding. This error is called roundoff error, regardless of whether we chop or round. 
The rule for rounding off a number to k decimals is as follows. (The rule for rounding 
off to k significant digits is the same, with “decimal” replaced by “significant digit.”) 

Roundoff Rule. Discard the (k -I- l)th and all subsequent decimals, (a) If the number 
thus discarded is less than half a unit in the £th place, leave the kth decimal unchanged 
(" rounding down”), (b) If it is greater than half a unit in the fcth place, add one to the A'th 
decimal (“ rounding up”), (c) If it is exactly half a unit, round off to the nearest even 
decimal. (Example: Rounding off 3.45 and 3.55 to 1 decimal gives 3.4 and 3.6, 
respectively.) 

The last part of the rule is supposed to ensure that in discarding exactly half a decimal, 
rounding up and rounding down happens about equally often, on the average. 

If we round off 1.2535 to 3, 2, 1 decimals, we get 1.254, 1.25, 1.3, but if 1.25 is rounded 
off to one decimal, without further information, we get 1 . 2 . 

Chopping is not recommended because the corresponding error can be larger than that 
in rounding, and is systematic. (Nevertheless, some computers use it because it is simpler 
and faster. On the other hand, some computers and calculators improve accuracy of results 
by doing intermediate calculations using one or more extra digits, called guarding digits.) 

Error in Rounding. Let d = /7(a) in (2) be the floating-point computer approximation 
of a in (1 ) obtained by rounding, where // suggests floating. Then the roundoff rule gives 
(by dropping exponents) \m — m\ ^ 5 • 10“ ,c . Since |w| ^0.1, this implies (when a ¥* 0) 


(3) 


a — a 
a 


m — m 
m 



10 1 "'*. 


The right side u = \ * lO 1- ”* is called the rounding unit. If we write d = a(\ + S), we 
have by algebra (< d — a)! a = S, hence |5| ^ u by (3). This shows that the rounding unit 
11 is an error bound in rounding. 

Rounding errors may ruin a computation completely, even a small computation. In 
general, these errors become the more dangerous the more arithmetic operations (perhaps 
several millions!) we have to perform. It is therefore important to analyze computational 
programs for expected rounding errors and to find an arrangement of the computations 
such that the effect of rounding errors is as small as possible. 

The arithmetic in a computer is not exact either and causes further errors; however, 
these will not be relevant to our discussion. 
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Accuracy in Tables. Although available software has rendered various tables of function 
values superfluous, some tables (of higher functions, of coefficients of integration 
formulas, etc.) will still remain in occasional use. If a table shows k significant digits, it 
is conventionally assumed that any value a in the table deviates from the exact value a 
by at most ±\ unit of the kth digit. 


Algorithm. Stability 

Numeric methods can be formulated as algorithms. An algorithm is a step-by-step 
procedure that states a numeric method in a form (a “pseudocode”) understandable to 
humans. (Turn pages to see what algorithms look like.) The algorithm is then used to 
write a program in a programming language that the computer can understand so that it 
can execute the numeric method. Important algorithms follow in the next sections. For 
routine tasks your CAS or some other software system may contain programs that you 
can use or include as parts of larger programs of your own. 

Stability. To be useful, an algorithm should be stable; that is, small changes in the 
initial data should cause only small changes in the final results. However, if small changes 
in the initial data can produce large changes in the final results, we call the algorithm 

unstable. 

This “numeric instability , " which in most cases can be avoided by choosing a better 
algorithm, must be distinguished from “mathematical instability ” of a problem, which is 
called “ ill-conditioning , ” a concept we discuss in the next section. 

Some algorithms are stable only for certain initial data, so that one must be careful in 
such a case. 


Errors of Numeric Results 

Final results of computations of unknown quantities generally are approximations; that 
is, they are not exact but involve errors. Such an error may result from a combination of 
the following effects. Roundoff errors result from rounding, as discussed on p. 782. 
Experimental errors are errors of given data (probably arising from measurements). 
Truncating errors result from truncating (prematurely breaking off), for instance, if we 
replace a Taylor series with the sum of its first few terms. These errors depend on the 
computational method used and must be dealt with individually for each method. 
[‘Truncating” is sometimes used as a term for chopping off (see before), a terminology 
that is not recommended.] 

Formulas for Errors. If a is an approximate value of a quantity whose exact value is 
a , we call the difference 

(4) e = a - a 

the error of a. Hence 

(4*) a = a - be, True value = Approximation + Error. 


For instance, if a = 10.5 is an approximation of a = 10.2, its error is e = -0.3. The 
error of an approximation a = 1.60 of a = 1.82 is € = 0.22. 
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CAUTION! In the literature | a — a\ (“absolute error”) or a — a are sometimes also 
used as definitions of error. 

The relative error e>. of a is defined by 


(5) 


e a — a _ Error 
a a True value 


( a * 0 ). 


This looks useless because a is unknown. But if |e| is much less than |2|, then we can use 
a instead of a and get 


(S') 




€ 

a 


This still looks problematic because e is unknown — if it were known, we could get 
a = a + e from (4) and we would be done. But what one often can obtain in practice is 
an error bound for 5, that is, a number (3 such that 

|e| = hence | a — a\ ^ (3. 


This tells us how far away from our computed a the unknown a can at most lie. Similarly, 
for the relative error, an error bound is a number /3 r such that 


1^1 ^ jSy, hence 




Error Propagation 

This is an important matter. It refers to how errors at the beginning and in later steps 
(roundoff, for example) propagate into the computation and affect accuracy, sometimes 
very drastically. We state here what happens to error bounds. Namely, bounds for the 
error add under addition and subtraction, whereas bounds for the relative error add under 
multiplication and division. You do well to keep this in mind. 


Error Propagation 

(a) in addition and subtraction , an error bound for the results is given by the 
sum of the error bounds for the terms. 

(b) In multiplication and division, an error bound for the relative error of the 
results is given ( approximately ) by the sum of the bounds for the relative errors 
of the given numbers . 


(a) We use the notations x = x + e lf y = y + e 2l tal g p l9 |e 2 | ^ /3 2 . Then for the error 
e of the difference we obtain 

M = !*->-(*- sol 

= - x - (y - 50| 

= Ml ~ «2l = Mil + M2I = ft + #2- 
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EXAMPLE 1 


The proof for the sum is similar and is left to the student. 

(b) For the relative error ^ of xy we get from the relative errors and e,. 2 of x 9 y 
and bounds /3 rl , 


kl 


xy - xy 


xy - (x- e x )(y - e 2 ) 


6 i 3 > + e 2 x - e x e 2 

xy 




xy 


£i y + fzx 

xy 



l«rll + M = Prl + PyZ- 


This proof shows what “approximately” means: we neglected € x e 2 as small in absolute 
value compared to |f x | and |e 2 |. The proof for the quotient is similar but slightly more 
tricky (see Prob. 15). ■ 


Basic Error Principle 

Every numeric method should be accompanied by an error estimate. If such a formula is 
lacking, is extremely complicated, or is impractical because it involves information (for 
instance, on derivatives) that is not available, the following may help. 

Error Estimation by Comparison. Do a calculation twice with different accuracy. 
Regard the difference d 2 ~ ci\ of the results a l9 a 2 as a {perhaps crude) estimate of the 
error of the inferior result a^. Indeed, d^ + = a 2 + e 2 by formula (4*). This implies 

a 2 “ a± = £i — e 2 e 1 because a 2 is generally more accurate than a l9 so that |e 2 | is 
small compared to lej. 

Loss of Significant Digits 

This means that a result of a calculation has fewer correct digits than the numbers from 
which it was obtained. This happens if we subtract two numbers of about the same size, 
for example, 0.1439 — 0.1426 (“subtractive cancellation”). It may occur in simple 
problems, but it can be avoided in most cases by simple changes of the algorithm — if one 
is aware of it! Let us illustrate this with the following basic problem. 

Quadratic Equation. Loss of Significant Digits 

Find the roots of the equation 

.v 2 - 40.v + 2 = 0, 

using 4 significant digits (abbreviated 4S) in the computation. 

Solution. A formula for the roots x v x 2 of a quadratic equation ax 2 + bx + c = 0 is 

(6) X\ = — (— b + Vfc 2 — 4ac), xo = ~ (~b — V& 2 — 4ac). 

2 a 2a 

Furthermore, since jc^ = cfa, another formula for those roots is 

c 

(7) x x as before, x 2 = . 

axj 

From (6) we obtain x = 20 ± V398 = 20.00 ± 19.95. This gives x x = 20.00 + 19.95 = 39.95, involving no 
difficulty, whereas ,v 2 = 20.00 — 19.95 = 0.05 is poor because it involves loss of significant digits. 

In contrast, (7) gives a*i = 39.95, .v 2 = 2.000/39.95 = 0.05006, in error by less than one unit of the last digit, 
as a computation with more digits shows. (The lOS-value is 0.05006265674.) 
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Comment . To avoid misunderstandings: 4S was used for convenience; (7) is better than (6) regardless of 
the number of digits used. For instance, the 8S-computation by (6) is .Vj = 39.949 937, x 2 = 0.050 063, which 
is poor, and by (7) it is x x as before. .v 2 = 2/xi = 0.050 062 657. 

In a quadratic equation with real roots, if ,v 2 is absolutely largest (because b > 0). use (6) for a 2 and then 
A'! = c/(ax 2 ). ■ 


UisssSSSSSSSSii 


1. (Floating point) Write 98.17, -100.987, 0.0057869, 
-13600 in floating-point form, rounded to 4S (4 
significant digits). 

2. Write -0.0286403, 1 1.25845, -31681.55 in floating- 
point form rounded to 6S. 

3. Small differences of large numbers may be 
particularly strongly affected by rounding errors. 
Illustrate this by computing 0.36443/(17.862 — 17.798) 
as given with 5S. then rounding stepwise to 4S, 3S, 
and 2S, where “stepwise” means: round the rounded 
numbers, not the given ones. 

4. Do the work in Prob. 3 with numbers of your choice 
that give even more drastically different results. How 
can you avoid such difficulties? 

5. The quotient in Prob. 3 is of the form al(b — c). Write 
it as a(b + c)l{b 2 — c 2 ). Compute it first with 5S, then 
rounding numerator 12.996 and denominator 2.28 
stepwise as in Prob. 3. Compare and comment. 

6. (Quadratic equation) Solve a 2 - 20.v + 1 = 0 by (6) 
and by (7), using 6S in the computation. Compare and 
comment. 

7. Do the computations in Prob. 6 with 4S and 2S. 

8. Solve a- 2 + 100 .v + 2 = 0 by (6) and (7) with 5S and 
compare. 

9. Calculate \le = 0.367879 (6S) from the partial sums 
of 5 to 10 terms of the Maclaurin series (a) of e“ ,T with 
a = 1, (b) of e x with a = 1 and then taking the 
reciprocal. Which is more accurate? 

10. Addition with a fixed number of significant digits 
depends on the order in which you add the numbers. 
Illustrate this with an example. Find an empirical rule 
for the best order. 

11. Approximations of it = 3.141 592 653 589 79 * • • 
are 22/7 and 355/113. Determine the corresponding 
errors and relative errors to 3 significant digits. 

12. Compute tt by Machin’s approximation 
16arctan (1/5) - 4 arctan (1/239) to 10S (which are 
correct). (In 1986, D. H. Bailey computed almost 
30 million decimals of 7r on a CRAY-2 in less than 
30 hours. The race for more and more decimals is 
continuing.) 

13. (Rounding and adding) Let a l9 • • • , a n be numbers 
with oj correcdy rounded to Dj decimals. In calculating 


the sum + • • • + a ni retaining D = min D$ 
decimals, is it essential that we first add and then round 
the result or that we first round each number to D 
decimals and then add? 

14. (Theorems on errors) Prove Theorem 1(a) for 
addition. 

15. Prove Theorem 1(b) for division. 

16. Show that in Example l the absolute value of the error 
of a 2 = 2.000/39.95 = 0.05006 is less than 0.00001. 

17. Overflow and underflow can sometimes be avoided 
by simple changes in a formula. Explain this in terms 
of Va 2 4 y 2 = aV 1 + (yl a) 2 with a 2 ^ y 2 and a so 
large that a 2 would cause overflow. Invent examples 
of your own. 

18. (Nested form) Evaluate 

/(a) = a 3 - 7.5a 2 4- 1 1.2* + 2.8 
= ((a-7.5)a+ U.2)a+ 2.8 
at x = 3.94 using 3S arithmetic and rounding, in both 
of the given forms. The latter, called the nested form, 
is usually preferable since it minimizes the number of 
operations and thus the effect of rounding. 

19. CAS EXPERIMENT. Chopping and Rounding. 

(a) Let a = 4/7 and v = 1/3. Find the errors e chop , e round 
and the relative errors e r>ch , € l%r<1 of a + v, a — v, xy, a ly 
in chopping and rounding to 5S. Experiment with other 
fractions of your choice. 

(b) Graph e chop and e round (for 5S) of k/2\ as a 
function of k = 1, 2, • • • , 21 on common axes. What 
average value can you read from the graph for e chop ? 
For e r0 und? Experiment with other integers that give 
similar graphs. Different types of graphs. Can you 
characterize the different types in terms of prime 
factors? 

(c) How does the situation in (b) change if you take 
4S instead of 5S? 

(d) Write programs for the work in (a)-(c). 

20. WRITING PROJECT. Numerics. In your own words 
write about the overall role of numeric methods in 
applied mathematics, why they are important, where 
and when they must be used or can be used, and how 
they are influenced by the use of the computer in 
engineering and other work. 
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19J Solution of Equations by Iteration 

From here on, each section will be devoted to some basic kind of problem and 
corresponding solution methods. We begin with methods of finding solutions of a single 
equation 

(1) f(x) = 0 

where / is a given function. For this task there are practically no formulas (except in a 
few simple cases), so that one depends almost entirely on numeric algorithms. A solution 
of (1) is a number a* = s such that f(s) = 0. Here, s suggests “solution,” but we shall also 
use other letters. 

Examples are a 3 + a = I , sin a = 0.5a, tan a = a, cosh a = sec a, cosh a cos a = — 1 , 
which can all be written in the form (1). The first concerns an algebraic equation because 
the corresponding / is a polynomial, and in this case the solutions are also called roots 
of the equation. The other equations are transcendental equations because they involve 
transcendental functions. Solving equations (1) is a task of prime importance because 
engineering applications abound: some occur in Chaps. 2, 4, 8 (characteristic equations), 
6 (partial fractions), 12 (eigenvalues, zeros of Bessel functions), and 16 (integration), but 
there are many, many others. 

To solve (1) when there is no formula for the exact solution, we can use an 
approximation method, in particular an iteration method, that is, a method in which we 
start from an initial guess a 0 (which may be poor) and compute step by step (in general 
better and better) approximations a x , a 2 , • • • of an unknown solution of (1). We discuss 
three such methods that are of particular practical importance and mention two others in 
the problem set. These methods and the underlying principles are basic for understanding 
the diverse methods in software packages. 

In general, iteration methods are easy to program because the computational operations 
are the same in each step — just the data change from step to step — and, more important, 
if in a concrete case a method converges, it is stable (see Sec. 19.1) in general. 

Fixed-Point Iteration for Solving Equations f(x) = 0 

Our present use of the word “fixed point” has absolutely nothing to do with that in the 
last section. 

In one way or another we transform (1) algebraically into the form 

(2) A- = g( a). 

Then we choose an a 0 and compute x x — g{ a 0 ), a 2 = gUi), and in general 

(3) A n+1 = g( a„) (n = 0, 1, • • •)• 

A solution of (2) is called a fixed point of g, motivating the name of the method. This is a 
solution of (1), since from a = g(.v) we can return to the original form /(a) = 0. From (1) 
we may get several different forms of (2). The behavior of corresponding iterative sequences 
-'o- -fit ’ ' * may differ, in particular, with respect to their speed of convergence. Indeed, some 
of them may not converge at all. Let us illustrate these facts with a simple example. 
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An Iteration Process (Fixed-Point Iteration) 

Set up an iteration process for the equation /(a) = .v 2 — 3.v +1=0. Since we know the solutions 
a=I.5±Vl 25, thus 2.618 034 and 0.381 966, 

we can watch the behavior of the error as the iteration proceeds. 

Solution . The equation may be written 

(4a) x = gi(x) = i(.v 2 + I), thus * n+1 = £(a w 2 + 1)* 

If we choose a 0 = 1 , we obtain the sequence (Fig. 423a; computed with 6S and then rounded) 

Xq = 1.000, x x = 0.667, x 2 = 0.481. a 3 = 0.41 K a 4 = 0.390, • • * 

which seems to approach the smaller solution. If we choose .v 0 = 2. the situation is similar. If we choose 
.v 0 = 3, we obtain the sequence (Fig. 423a, upper part) 


.v 0 = 3.000, x x = 3.333, a 2 = 4.037, a 3 = 5.766, a 4 = 1 1.415, • • • 

which diverges. 

Our equation may also be written (divide by x) 

1 1 

(4b) x = g 2 {x) = 3 7 , thus A n+1 =3 , 

x x n 

and if we choose .v 0 = 1 , we obtain the sequence (Fig. 423b) 

A ‘ 0 = 1.000, .Yi = 2.000, x 2 = 2.500, .y 3 = 2.600, .v 4 = 2.615, * - • 

which seems to approach the larger solution. Similarly, if we choose a 0 = 3, we obtain the sequence (Fig. 423 b) 

*0 = 3.000, Ax = 2.667, a 2 = 2.625, a 3 = 2.619, * 4 = 2.618, ••• . 

Our figures show the following. In the lower part of Fig. 423a the slope of g x (x) is less than the slope of 
y = .v, which is 1, thus \g[(x)\ < l. and we seem to have convergence. In the upper part. gjCv) is steeper 
(g{(A) > 1) and we have divergence. In Fig. 423b the slope of g 2 (x) is less near the intersection point (a* = 2.618, 
fixed point of g 2 , solution of f(x) = 0), and both sequences seem to converge. From all this we conclude that 
convergence seems to depend on the fact that in a neighborhood of a solution the curve of g(x) is less steep 
than the straight liney = a, and we shall now see that this condition |g # (A)| < 1 (= slope of y = a) is sufficient 
for convergence. M 




Fig. 423. Example 1, iterations (4a) and (4b) 
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An iteration process defined by (3) is called convergent for an x 0 if the corresponding 
sequence x 0 , jc x , • • • is convergent. 

A sufficient condition for convergence is given in the following theorem, which has 
various practical applications. 


Convergence of Fixed-Point Iteration 

Let x = s be a solution ofx = x) and suppose that g has a continuous derivative 
in some interval J containing s. Then if\g\x)\ ^ K < 1 in J, the iteration process 
defined by (3) converges for any x 0 in J, and the limit of the sequence {x n } is s. 


By the mean value theorem of differential calculus there is a t between x and s such that 

g(x) ~ g(s) = g'(t) (x - s) (x in 7). 

Since g(s) = s and = gCr 0 ), x 2 = g(x i), • • • , we obtain from this and the condition on 
|g'(;t)| in the theorem 

k - *1 = Isk-i) - g(s)l = k(0lk-i - *1 = ^k-i - 4 

Applying this inequality n times, for n f n — 1 , • • • , 1 gives 

k - *1 = Kk-i ~ *\ = ^k-2 -•*! = ••• = * n k - 4 

Since K < 1, we have K n — > 0; hence (x* - s\ — » 0 as n — > ■ 

We mention that a function g satisfying the condition in Theorem 1 is called a contraction 
because |g(x) — g(u)| ^ K\x — u|, where K < 1. Furthermore, K gives information on the 
speed of convergence. For instance, if K = 0.5, then the accuracy increases by at least 
2 digits in only 7 steps because 0.5 7 < 0.01. 

An Iteration Process. Illustration of Theorem 1 

Find a solution of /(*) = x 3 4- a* - 1 = 0 by iteration. 

Solution . A sketch shows that a solution lies near * = 1. We may write the equation as ( x 2 4- 1 )jc = 1 or 

* i ■ , . M 

X = gi(x) = 2 . so that x n+1 = — — 2 • Also \gi(x)\ = 2,2 < 1 

l • X i • Xft ) 

for any * because 4 a* 2 /(1 + .v 2 ) 4 = 4 .y 2 /(1 + Ax 2 + ••■)< 1, so that by Theorem 1 we have convergence for 
any .v 0 . Choosing xq = I , we obtain (Fig. 424 on p. 790) 

x x = 0.500, x 2 = 0.800, x 3 = 0.610, .v 4 = 0.729, .v 5 = 0.653, a* 6 = 0.701, 

The solution exact to 6D is s = 0.682 328. 

The given equation may also be written 

X = S 2 M = 1 - * 3 - Then fegWl = 3* 2 

and this is greater than 1 near the solution, so that we cannot apply Theorem i and assert convergence. Try 
xq = 1, *0 = 0.5. xq — 2 and see what happens. 

The example shows that the transformation of a given f(x ) = 0 into the form * = g(x) with g satisfying 
^ K < 1 may need some experimentation. gj 
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Newton's Method for Solving Equations f(x) = 0 

Newton’s method, also known as Newton-Raphson’s method , 1 is another iteration 
method for solving equations /(a) = 0, where f is assumed to have a continuous derivative 
The method is commonly used because of its simplicity and great speed. The underlying 
idea is that we approximate the graph of / by suitable tangents. Using an approximate 
value A' 0 obtained from the graph of /, we let a x be the point of intersection of the A-axis 
and the tangent to the curve of / at a 0 (see Fig. 425). Then 


tan = /'( a 0 ) = 


/(* o) 

A 0 « Aj 


hence 


Ax = A' 0 - 


f(Xp) 

/Vo) * 


In the second step we compute a 2 = a x — /fo)// Vi), in the third step a 3 from a 2 again 
by the same formula, and so on. We thus have the algorithm shown in Table 19.1. Formula 
(5) in this algorithm can also be obtained if we algebraically solve Taylor’s formula 


(5*) 


/(*n+l) ~ /C*n) + (*n+ 1 ~ W = 0. 



JOSEPH RAPHSON (1648-1715), English mathematician who published a method similar to Newton’s 
method. For historical details, see Ref. [GR2], p. 203. listed in App. 1. 
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Table 19.1 Newton's Method for Solving Equations f[x) = 0 


ALGORITHM NEWTON (/, x 0 , e, N) 

This algorithm computes a solution of f(x) = 0 given an initial approximation x 0 (starting 
value of the iteration). Here the function f(x) is continuous and has a continuous 
derivative 

INPUT: /, /\ initial approximation x 0 , tolerance e > 0, maximum number of 
iterations N . 

OUTPUT: Approximate solution x n (n ^ N) or message of failure. 

For n = 0, l, 2, • • • , N — 1 do: 

Compute /'(*»)• 

If f'(xj = 0 then OUTPUT “Failure”. Stop. 

[Procedure completed unsuccessfully] 

Else compute 


(5) 


■*71-1-1 -*7T 


f(x n ) 
f\x n ) ' 


If |.v n+1 - xj S e|,v n | then OUTPUT x n+1 . Stop. 

[Procedure completed successfully] 

End 

5 OUTPUT “Failure". Stop. 

[Procedure completed unsuccessfully after N iterations] 

End NEWTON 


If it happens that f'(x n ) = 0 for some n (see line 2 of the algorithm), then try another 
starting value x 0 . Line 3 is the heart of Newton’s method. 

The inequality in line 4 is a termination criterion. If the sequence of the x n converges 
and the criterion holds, we have reached the desired accuracy and stop. In this line the 
factor |jc n | is needed in the case of zeros of very small (or very large) absolute value 
because of the high density (or of the scarcity) of machine numbers for those x. 

WARNING ! The criterion by itself does not imply convergence. Example . The harmonic 
series diverges, although its partial sums x n = l/k satisfy the criterion because 
lim (x n +i ~ x n ) = lim (1 !{n + 1)) = 0. 

Line 5 gives another termination criterion and is needed because Newton’s method may 
diverge or, due to a poor choice of x 0 , may not reach the desired accuracy by a reasonable 
number of iterations. Then we may try another ^ 0 . If f(x) = 0 has more than one solution, 
different choices of x 0 may produce different solutions. Also, an iterative sequence may 
sometimes converge to a solution different from the expected one. 
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EXAMPLE 3 


EXAMPLE 4 


EXAMPLE 5 


Square Root 

Set up a Newton iteration for computing the square root .v of a given positive number c and apply it to c = 2. 
Solution . We have a* = Vc, hence fix) = a* 2 — c - 0. /'(.*) = 2a, and (5) takes the form 

2 . 


1 / ^ c \ 


For c = 2. choosing aq = 1 . we obtain 

A'x = 1.500 000. * 2 = 1.416 667, a 3 = 1.414216, a 4 = 1.414214, 

a *4 is exact to 6D. 


Iteration for a Transcendental Equation 

Find the positive solution of 2 sin x = .v. 

Solution . Setting /(a) = a — 2 sin.v, we have /'(a) =1—2 cos a, and (5) gives 

.v n - 2 sin x n _ 2(sin a w - a» cos a w ) _ N^_ 

A ‘n+1 A n j _ 2c0SA n 1 — 2cOSA n D n 

From the graph of / we conclude that the solution is near xq = 2. We compute: 


n 


N» 

D n 

*^n+l 

0 

2.00000 

3.48318 

1.83229 

1.90100 

1 

1.90100 

3.12470 

1.64847 

1.89552 

2 

1.89552 

3.10500 

1.63809 

1.89550 

3 

1.89550 

3.10493 

1.63806 

1.89549 


a 4 = 1.89549 is exact to 5D since the solution to 6D is 1.895 494. ■ 

Newton’s Method Applied to an Algebraic Equation 

Apply Newton’s method to the equation /(a) = a 3 + a — 1 = 0. 

Solution . From (5) we have 

_ x i? + X n - 1 _ 2v n 3 + 1 
• V,,+1 ■*" 3.v n 2 + I 3.v n 2 + 1 ' 

Starting from a 0 = 1, we obtain 

xi = 0.750 000. a 2 = 0.686 047, a 3 = 0.682 340, a 4 = 0.682 328, • • • 

where a 4 has the error —1*1 0“ 6 . A comparison with Example 2 shows that the present convergence is much 
more rapid. This may motivate the concept of the order of an iteration process, to be discussed next. I 

Order of an Iteration Method. Speed of Convergence 

The quality of an iteration method may be characterized by the speed of convergence, as 
follows. 

Let x n+1 = g(.r n ) define an iteration method, and let x n approximate a solution s of 
.v = g(x). Then x n = s — 6 n , where is the error of x n . Suppose that g is differentiable 
a number of times, so that the Taylor formula gives 

*«+l = g( x n) = g(s) + g'(s)(x n - S) + ig"(s)(x n ~ S ) 2 + ’ • • 

= g(s) - g'(s )e n + \g"(s)€n + • • • . 


( 6 ) 
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THEOREM 


The exponent of in the first nonvanishing term after g(s) is called the order of the 
iteration process defined by g. The order measures the speed of convergence. 

To see this, subtract g(s) = s on both sides of (6). Then on the left you get 
jc n+ 1 — s = — where is the error of x n+1 . And on the right the remaining 
expression equals approximately its first nonzero term because |ej is small in the case of 
convergence. Thus 

(a) » -fg'(s)€ n in the case of first order, 

(7) , „ 9 

(b) tn+i ~ 2 & (s) 6 * 2, in the case of second order, etc. 

Thus if €* = I0~ k in some step, then for second order, e n+1 = c • (lO - *) 2 = c • 10" 2fc , 
so that the number of significant digits is about doubled in each step. 


Convergence of Newton's Method 

In Newton’s method, g(x) = x — f(x)/f'(x). By differentiation, 

/'w 2 - /to/"to 

* « = 1 - 

(8) 1 W 

/to/"to 

/'to 2 ‘ 


Since /(.?) = 0, this shows that also g ' (s) = 0. Hence Newton’s method is at least of 
second order. If we differentiate again and set x = s, we find that 


( 8 *) 


/to = 


/"to 

/'to 


which will not be zero in general. This proves 


Second-Order Convergence of Newton’s Method 

If /to is three times differentiable and f' and f" are not zero at a solution s of 
/to = 0, then for x 0 sufficiently close to s, Newton ’s method is of second order. 


Comments. For Newton’s method, (7b) becomes, by (8*), 

/"to , 

(9) €re+1 ~ 2 /'(s) ‘ 

For the rapid convergence of the method indicated in Theorem 2 it is important that s be 
a simple zero of /to (thus /'to ^ 0) and that x 0 be close to s, because in Taylor’s formula 
we took only the linear term [see (5*)], assuming the quadratic term to be negligibly small. 
(With a bad x 0 the method may even diverge!) 
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EXAMPLE 6 


EXAMPLE 7 


Prior Error Estimate of the Number of Newton Iteration Steps 

Use x 0 - 2 and .Vj = 1.901 in Example 4 for estimating how many iteration steps we need to produce the 
solution to 5D accuracy. This is an a priori estimate or prior estimate because we can compute it after only 
one iteration, prior to further iterations. 

Solution. We have /(.*) = .v - 2 sin x = 0. Differentiation gives 

As) A*,) 2 sin .Vi 

— ; — * — 7 = ** 0.57. 

2 f'(s) 2U-2C0S.V!) 

Hence (9) gives 

|e,, +1 | = 0.57e,, 2 * 0.57(0.57e 2 -i) 2 = 0.57 3 ef,_, =»•••« 0.57 M eJ ,+1 s 5 - JO -6 

where M = 2" + 2" -1 + • • • + 2 + 1 = 2” + 1 - 1. We show below that e 0 ~ —0.1 1. Consequently, our 
condition becomes 

0.57 m 0.11 m+1 s 5 . io" 6 . 

Hence ;i = 2 is the smallest possible n. according to this crude estimate, in good agreement with Example 4. 

=* “0.11 is obtained from - e Q = (ct - s) - (e 0 - s) - -x x + .v 0 — 0.10, hence 
€ i = *b + 0.10 = — 0.57e o 2 or 0.57eo 2 + e 0 + 0.10 » 0, which gives e 0 = —0.1 1. ■ 

Difficulties in Newton’s Method. Difficulties may arise if \f\x)\ is very small near a 
solution s of fix) = 0, for instance, if s is a zero of /( a ) of second (or higher) order (so 
that Newton’s method converges only linearly, as an application of l’Hopitafs rule to 
(8) shows). Geometrically, small |/'(a)| means that the tangent of f(x ) near s almost 
coincides with the A-axis (so that double precision may be needed to get /( a ) and /'( a ) 
accurately enough). Then for values a = ? far away from s we can still have small function 
values 

R(s) = /(?). 

In this case we call the equation /(a) = 0 ill-conditioned. R(s) is called the residual of 
/(a) = 0 at s. Thus a small residual guarantees a small error of s only if the equation is 
not ill-conditioned. 


An Ill-Conditioned Equation 

fix) - x 5 + 10” 4 .v = 0 is ill-conditioned, a* = 0 is a solution. f'(0) = 10” 4 is small. At .v = 0.1 the residual 
/(0.1) = 2 • 10” 5 is small, blit the error — 0.1 is larger in absolute value by a factor 5000. Invent a more drastic 
example of your own. ■ 


Secant Method for Solving f(x) = 0 

Newton’s method is very powerful but has the disadvantage that the derivative /' may 
sometimes be a far more difficult expression than / itself and its evaluation therefore 
computationally expensive. This situation suggests the idea of replacing the derivative 
with the difference quotient 


/VJ - 


/(A n ) ~ /(An— i) 
A n A~ n _ i 


Then instead of (5) we have the formula of the popular secant method 
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EXAMPLE 8 



a«) aZ-IZ o- 

Geometrically, we intersect the .v-axis at x n+1 with the secant of fix) passing through 
P n _i and P n in Fig. 426. We need two starting values a 0 and av Evaluation of derivatives 
is now avoided. It can be shown that convergence is superlinear (that is, more rapid than 
linear, |e n+1 | « const • |ej 162 ; see [E5] in App. 1), almost quadratic like Newton’s method. 
The algorithm is similar to that of Newton’s method, as the student may show. 

CAUTION! It is not good to write (10) as 

l/Cjgt) X n fj\ n ^ l) 

A ” +1 " /(*„) - /(*„_!) ’ 

because this may lead to loss of significant digits if A n and x n ^ x are about equal. (Can 
you see this from the formula?) 

Secant Method 

Find the positive solution of /(a) = a* - 2 sin x = 0 by the secant method, starting from .vq ° 2, = 1 .9. 

Solution . Here, (10) is 


*n+l * 


(.v n 2 sin A' n )(.v n -Vn— l) 

x n - .v«_i + 2(sin ,v 7J _ 1 - sin.v n ) 


N n 


Numerical values are: 


n 

A n - 1 

An 

N n 

D„ 

A n+1 “ A^ 


2.000 000 

1.900 000 

-0.000 740 

-0.174005 

-0.004 253 


1.900 000 

1.895 747 

-0.000 002 

-0.006 986 

-0.000 252 


1.895 747 

1.895 494 

0 


0 


,v 3 = 1 .895 494 is exact to 6D. See Example 4. ■ 

Summary of Methods. The methods for computing solutions s of fix) = 0 with given 
continuous (or differentiable) f(x) start with an initial approximation x 0 of s and generate 
a sequence a* 1? a* 2 , • • • by iteration. Fixed point methods solve f(x) = 0 written as 
a* = g(x), so that s is a. fixed point of g, that is, s = g(s). For g(A*) = a* — f(x)!f\x) this 
is Newton’s method, which for good x 0 and simple zeros converges quadratically (and 
for multiple zeros linearly). From Newton’s method the secant method follows by 
replacing f'(x) by a difference quotient. The bisection method and the method of false 
position in Problem Set 19.2 always converge, but often slowly. 
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L -I LH OB LEM - SET 19. 3 = 


[T— 7] FIXED-POINT ITERATION 

Apply fixed-point iteration and answer related questions 
where indicated. Show details of your work. 

1. a* = 1 .4 sin .V, A*o = 1 .4 

2. Do the iterations indicated at the end of Example 2. 
Sketch a figure similar to Fig. 424. 

3. Why do we obtain a monotone sequence in Example 
1 . but not in Example 2? 

4. / = a* 4 - a* + 0.2 = 0, the root near 1, ,v 0 = 1 

5. / as in Prob. 4, the root near 0. ,v 0 = 0 

6. Find the smallest positive solution of sin a* = e ~ 0 5x , 

A'o = 1. 

7. (Bessel functions, drumhead) A partial sum of the 
Maclaurin series of 7 0 (.v) (Sec. 5.5) is 

/(a*) = l - ^a 2 4 ^.v 4 - e^a 6 . Conclude from a 
sketch that /(a:) = 0 near a* = 2. Write /(.v) = 0 as 
a* = g(x) (by dividing /(, „v) by J.v and taking the 
resulting A-term to the other side). Find the zero. (See 
Sec. 12.9 for the importance of these zeros.) 


8. CAS PROJECT. Fixed-Point Iteration, (a) Existence. 
Prove that if g is continuous in a closed interval / and its 
range lies in /, then the equation a* = g(x) has at least one 
solution in /. Illustrate that it may have more than one 
solution in /. 

(b) Convergence. Let /(a) = a* 3 + lv 2 - 3a -4 = 0. 
Write this as a = g(A), for g choosing ( 1 ) (a 3 - /) 1/3 , 
(2) (a 2 - if) 1 ' 2 . (3) * + if. (4) .v(l + I/), 
(5) (.v 3 - /)/ a 2 . (6) (2a 2 - /)/ 2a. (7) a - ///' 
and in each case .v 0 = 1 .5. Find out about convergence 
and divergence and the number of steps to reach exact 
6S-values of a root. 


9-18 


NEWTON’S METHOD 


Apply Newton’s method (6D accuracy). First sketch the 
function(s) to see what is going on. 

9. sin a = cot a. A’o = 1 

10. a = cos A, A 0 = I 

11. A 3 - 5 A + 3 = 0, A 0 = 2 

12. a 4* In a = 2, A 0 = 2 


13. (Vibrating beam) Find the solution of cos a* cosh a = 1 
near a = §tt. (This determines a frequency of a vibrating 
beam: see Problem Set 1 2.3.) 


14. (Heating, cooling) At what time a (4S-accuracy only) 
will the processes governed by /,(.v) = 100(1 - e~ 0 2x ) 
and f 2 (x) = 40e” aoix reach the same temperature? 
Also find the latter. 


15. (Associated Legendre functions) Find the smallest 
positive zero of 

/> 4 2 = (I - a 2 )/>4 = f (-7 a 4 + 8a 2 - 1) (Sec. 
5.3) (a) by Newton’s method, (b) exactly, by solving 
a quadratic equation. 

16. (Legendre polynomials) Find the largest root of the 
Legendre polynomial P 5 (a) given by 

P S ( V) = |(63a 5 - 70a 3 + 15a) (Sec. 5.3) (to be 
needed in Gauss integration in Sec. 19.5) (a) by 
Newton's method, (b) from a quadratic equation. 

17. Design a Newton iteration for cube roots and compute 
^7 (6D. a 0 = 2). 

18. Design a Newton iteration for Vc (c* > 0). Use it to 
compute V2. ^2. ^2, ^2 (6D. ,v 0 = I). 

19. TEAM PROJECT. Bisection Method. This simple 
but slowly convergent method for finding a solution of 
/(a) = 0 with continuous / is based on the 
intermediate value theorem, which states that if a 
continuous function / has opposite signs at some a = a 
and a = b (> a ), that is, either f(a) < 0, /(/;) > 0 
or f(a) > 0, f(b ) < 0, then f must be 0 somewhere 
on fr/, b). The solution is found by repeated bisection 
of the interval and in each iteration picking that half 
which also satisfies that sign condition. 

(a) Algorithm. Write an algorithm for the method. 

(b) Comparison. Solve a = cos a by Newton’s 
method and by bisection. Compare. 

(c) Solve e~ x = In a and e x 4 a 4 4 a = 2 by bisection. 

20. TEAM PROJECT. Method of False Position 
(Regula falsi). Figure 427 shows the idea. We assume 
that / is continuous. We compute the .v-intercept c 0 of 
the line through (a 0 , /(<* oh ( b 0 . f(b 0 )). If J(c 0 ) = 0, 
we are done. If f(a 0 )f(c 0 ) < 0 (as in Fig. 427), we 
set a i = a 0 , b x = c 0 and repeat to get c u etc. 
If fUto)f(c 0 ) > 0, then f(c Q )f(b 0 ) < 0 and we set 
a i = c 0 . b x = Z? 0 , etc. 

(a) Algorithm. Show that 

_ *o/(fro) ~ bo.f(«o) 
f(bo) ~ f(cto) 

and write an algorithm for the method. 

(b) Comparison. Solve a 3 = 5a + 6 by Newton's 
method, the secant method, and the method of false 
position. Compare. 

(c) Solve a 4 = 2, cos a = Va, and a + In a = 2 
by the method of false position. 
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21-24 


SECANT METHOD 


Solve, using a 0 and x x as indicated. 

21. Prob. 11, jc 0 = 0.5, x x = 2.0 

22. e~ x — tan x = 0, a * 0 = 1. x x = 0.7 

23. Prob. 9, a * 0 = 1, x x = 0.5 

24. Prob. 10, A- 0 = 0.5, x x = 1 


25. WRITING PROJECT. Solution of Equations. 

Compare the methods in this section and problem set, 
discussing advantages and disadvantages using 
examples of your own. 


19.2 Interpolation 

Interpolation means finding (approximate) values of a function f(x) for an x between 
different x- values x 0 , x x , • • • , a* w at which the values of f(x) are given. These values may 
come from a “mathematical” function, such as a logarithm or a Bessel function, or, perhaps 
more frequently, they may be measured or automatically recorded values of an “empirical” 
function, such as the air resistance of a car or an airplane at different speeds, or the yield 
of a chemical process at different temperatures, or the size of the U.S. population as it 
appears from censuses taken at 10-year intervals. We write these given values of a function 
/ in the form 

fo = f(x 0 ), /i = /to), • , f n = /(*») 

or as ordered pairs 

to» fo)* to.» /i)> * toi* fn )• 

A standard idea in interpolation now is to find a polynomial p n (x) of degree n (or less) 
that assumes the given values; thus 

(1) Pnbo) = fo* Pnto) = fl* • • • . Pn&n) = fiv 

We call this p n an interpolation polynomial and x 0 , • • • , x n the nodes. And if f(x) is a 
mathematical function, we call p n an approximation of / (or a polynomial 
approximation, because there are other kinds of approximations, as we shall see later). 
We use p n to get (approximate) values of / for x's between x 0 and a*„ (‘Interpolation”) 
or sometimes outside this interval x 0 ^ x ^ x n (“extrapolation”). 

Motivation. Polynomials are convenient to work with because we can readily 
differentiate and integrate them, again obtaining polynomials. Moreover, they approximate 
continuous functions with any desired accuracy. That is, for any continuous f(x) on an 
interval 7: a ^ a ^ b and error bound /? > 0, there is a polynomial p n (x ) (of sufficiently 
high degree n) such that 

l/M - p n (x)\ < p for all a: on J. 

This is the famous Weierstrass approximation theorem (for a proof see Ref. [GR7] 
p. 280; see App. 1). 
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EXAMPLE 1 


Existence and Uniqueness. p n satisfying ( 1 ) for given data exists — we give formulas 
for it below. p n is unique. Indeed, if another polynomial q n also satisfies 
q n C*o) = U ■ ■ ■ , tfn( x n) = fn’ then pjx) - q n (x) = 0 at x 0 , • • • , x n , but a polynomial 
Pn — Qn of degree n (or less) with n + 1 roots must be identically zero, as we know from 
algebra; thus p n (x ) = q n ( a*) for all a\ which means uniqueness. ■ 

How to Find /? n ? This is the important practical question. We answer it by explaining 
several standard methods. For given data, these methods give the same polynomial, by 
the uniqueness just proved (which is thus of practical interest!), but expressed in several 
forms suitable for different purposes. 


Lagrange Interpolation 

Given (.v 0 , jo), Ui, /i), * * * , (x n , f v ) with arbitrarily spaced Lagrange had the idea of 
multiplying each by a polynomial that is 1 at Xj and 0 at the other n nodes and then 
taking the sum of these n + 1 polynomials. Clearly, this gives the unique interpolation 
polynomial of degree n or less. Beginning with the simplest case, let us see how this works. 

Linear interpolation is interpolation by the straight line through (x 0 , / 0 ), (a*, / x ); see 
Fig. 428. Thus the linear Lagrange polynomial p y is a sum p x = L 0 f 0 + Lif x with L 0 the 
linear polynomial that is 1 at x 0 and 0 at x x ; similarly, L x is 0 at x 0 and 1 at x v Obviously, 


4 ( x) = 



L x (a) = 


X - X 0 
A‘i “ A*o ' 


Tliis gives the linear Lagrange polynomial 


(2) pi(x) = L 0 (x)f 0 + U x)h = A _ Xl • fo + X ° ' fv 

A'o X 1 X 1 A'o 



Linear Lagrange Interpolation 

Compute a 4D-value of In 9.2 from In 9.0 = 2.1972, In 9.5 = 2.2513 by linear Lagrange interpolation and 
determine the error, using In 9.2 = 2.2192 (4D). 

Solution. x 0 = 9.0. .\i = 9.5. f 0 = In 9.0. /j = In 9.5. In (2) we need 

Lo(x) = ~ ~ Q 9 f = -2.0(.v - 9.5), Z. 0 (9.2) = -2.0(-0.3) = 0.6 

x - 9.0 

Liix) = Q5 = 2.0 (.v - 9.0). Li(9.2) = 2 • 0.2 = 0.4 

(see Fig. 429) and obtain the answer 

In 9.2 ~/M9.2) = Lo(9.2)/ 0 + ^(9.2)/! = 0.6-2.1972 + 0.4-2.2513 = 2.2188. 
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EXAMPLE 2 


The error is e — a - a = 2.2192 — 2.2188 = 0.0004. Hence linear interpolation is not sufficient here to get 
4D-accuracy; it would suffice for 3D-accuracy. M 



Fig. 429. L 0 and in Example 1 


Quadratic interpolation is interpolation of given ( x 0 , / 0 ), (x x , / a ), (x 2> f 2 ) by a 
second-degree polynomial p 2 (x ), which by Lagrange’s idea is 


(3a) p 2 {x) = L 0 (x)f 0 + L 1 (x)f l + Li(x)f 2 


with L 0 (x 0 ) = 1, L^Xi) = L, L 2 (x 2 ) = 1, and Lq^) = L Q (x 2 ) = 0, etc. We claim that 


(3b) 


( _ lp(x) = (x - x x )(x - x 2 ) 

A Iq(x q ) (x 0 “ xt)(x 0 - x 2 ) 

/l(x) _ (x - x 0 )(x - x 2 ) 

1 /lC*l) (x 1 - X 0 )(X! - x 2 ) 

If A _ k(*) _ (X - X 0 )(x - Xx) 

A / 2 U 2 ) C*2 - A*o)(x 2 - Xj) * 


How did we get this? Well, the numerator makes L k (xj) = 0 if j ± k. And the denominator 
makes L k (x k ) = 1 because it equals the numerator at x = x k . 

Quadratic Lagrange Interpolation 

Compute In 9.2 by (3) from the data in Example 1 and the additional third value In 1 1.0 = 2.3979. 

Solution . In (3). 

(jc — 9.5)(jc — 11.0) 2 

W = ■ ( 9 F-9 l )g F= 1T 0 ) 2a5 * + 104 - 5 - L ° m) = °- 5400 ’ 

(x - 9.0)(x - 1 1.0) 1 , 

- (9 .5 ~ 9.0)(9.5 — ~ ll.0) = - - 2Q * + ^ 92 > = a4800 ' 

(.r — 9.0)(.v - 9.5) 1 0 

= O l-O ~ 9.0)( 7T .0^ 9 .5) = ~3 {x ~ 1S5x + 855) > ^ (92 > = -° 0200 ’ 

(see Fig. 430), so that (3a) gives, exact to 4D, 

In 9.2 = p 2 ( 9.2) = 0.5400 • 2.1972 + 0.4800 • 2.2513 - 0.0200 • 2.3979 = 2.2192. ■ 



Fig. 430. L 0 , L v L 2 in Example 2 
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THEOREM 1 


General Lagrange Interpolation Polynomial. For general n we obtain 


(4a) 


/to - Pn( x ) = 2 4to/k = 2 

fe =0 fc =0 


fk 


where L k { x k ) = 1 and L k is 0 at the other nodes, and the L k are independent of the function 
f to be interpolated. We get (4a) if we take 


loM = (x - x x )(x - x 2 ) • ■ • (.v - x n ) y 

(4b) / fc (x) = (a - x 0 ) • • • (x - **;_!)(* - x fc+1 ) • • • (a - x n ), 0 < k < n, 

/»(*) “ (* - A 0 )(a - Ai) • • • (A - A n _i). 


We can easily see that p n (x k ) = / te . Indeed, inspection of (4b) shows that l k (Xj) = 0 if 
j =£ k, so that for a — x ky the sum in (4a) reduces to the single term (l k (x k )/l k (x k ))f k = f k . 

Error Estimate. If / is itself a polynomial of degree n (or less), it must coincide with 
p n because the n 4- 1 data (a 0 , / 0 ), • • • , ( x n , f n ) determine a polynomial uniquely, so 
the error is zero. Now the special f has its (n 4* l)st derivative identically zero. This 
makes it plausible that for a general f its ( n + l)st derivative / <n+1) should measure the 
error 


= /(*) - Pn( A). 

It can be shown that this is true if / Cn+: l) exists and is continuous. Then, with a suitable 
t between x 0 and x w (or between a 0 , x w , and a if we extrapolate). 


(5) € n (x) = /(a) - p n ( a) = (a - a 0 )(a “ a x ) • • • (A - X n ) - 


Thus | € w (a)| is 0 at the nodes and small near them, because of continuity. The product 
(a — Xg) • • ■ ( a — x n ) is large for a away from the nodes. This makes extrapolation risky. 
And interpolation at an a will be best if we choose nodes on both sides of that a. Also* 
we get error bounds by taking the smallest and the largest value of f in+l \t) in (5) on the 
interval x 0 ^ ^ x n (or on the interval also containing a if we extrapolate). 

Most importantly, since p n is unique, as we have shown, we have 


Error of Interpolation 

Formula (5) gives the error for any polynomial interpolation method if fix) has a 
continuous (n + 1 )st derivative. 


Practical error estimate. If the derivative in (5) is difficult or impossible to obtain, apply 
the Error Principle (Sec. 19.1), that is, take another node and the Lagrange polynomial 
Pn+iix) and regard p n + i(x) — p n {x) as a (crude) error estimate for p n { x). 
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EXAMPLE 3 


Error Estimate (5) of Linear Interpolation. Damage by Roundoff. Error Principle 

Estimate the error in Example I first by (5) directly and then by the Error Principle (Sec. 19.1). 

Solution . (A) Estimation by (5). We have n = I. /(/) = In t. fit) = Mu fit) = — Mt 2 . Hence 

(-1) 0.03 

«=i(-v) = (x - 9.0)(a* - 9.5) ~~ 2 ~ ■ thus ei (9.2) = . 

t - 9.0 gives the maximum 0.03/9 2 = 0.00037 and t = 9.5 gives the minimum 0.03/9.5 2 = 0.00033, so that 
we get 0.00033 ^ ei(9.2) ^ 0.00037, or better, 0.00038 because 0.3/81 = 0.003 703 
But the error 0.0004 in Example 1 disagrees, and we can learn something! Repetition of the computation there 
with 5D instead of 4D gives 

In 9.2 « y?j(9.2) = 0.6 ■ 2. 19722 + 0.4 • 2.25129 = 2.21885 

with an actual error e = 2.21920 — 2.21885 = 0.00035, which lies nicely near the middle between our two 
error bounds. 

This shows that the discrepancy (0.0004 vs. 0.00035) was caused by rounding, which is not taken into account 
in (5). 

(B) Estimation by the Error Principle . We calculate /; 1 (9.2) = 2.21885 as before and then p 2 (9.2) as in 
Example 2 but with 5D, obtaining 

/? 2 (9.2) = 0.54-2.19722 + 0.48-2.25129 - 0.02*2.39790 = 2.21916. 

The difference /? 2 (9.2) — p x { 9.2) = 0.00031 is the approximate error of p x (9.2) that we wanted to obtain; this 
is an approximation of the actual error 0.00035 given above. H 

Newton's Divided Difference Interpolation 

For given data ( x 0 , / 0 ), • • • , (.v n , f n ) the interpolation polynomial p n (x) satisfying (1) is 
unique, as we have shown. But for different purposes we may use p n (x) in different forms. 
Lagrange’s form just discussed is useful for deriving formulas in numeric differentiation 
(approximation formulas for derivatives) and integration (Sec. 19.5). 

Practically more important are Newton’s forms of p n (x), which we shall also use for solving 
ODEs (in Sec. 21.2). They involve fewer arithmetic operations than Lagrange’s form. 
Moreover, it often happens that we have to increase the degree n to reach a required accuracy. 
Then in Newton’s forms we can use all the previous work and just add another term, a 
possibility without counterpart for Lagrange’s form. This also simplifies the application of 
the Error Principle (used in Example 3 for Lagrange). The details of these ideas are as follows. 

Let p n -i(x) be the (n — l)st Newton polynomial (whose form we shall determine); thus 
Ai-i(*o) = /o. Pn- i(*i) = fi, • • • , Pn-i(x n -i) = fn- 1 - Furthermore, let us write the nth 
Newton polynomial as 


(6) 

Pn(x) = Pn-l(x) + g n (x); 

hence 


(6') 

8n(x) = P„( X) ~ P n -l{x). 


Here g n (x) is to be determined so that p n (x 0 ) = / 0 , p n {x i) = • * • , p n (x n ) = f n . 

Since p n and p n-1 agree at * 0 , • • • , x n _ 1? we see that g n is zero there. Also, g n will 
generally be a polynomial of nth degree because so is p m whereas p n _j can be of degree 
n — 1 at most. Hence g n must be of the form 


( 6 ") 


On(x Xq)(x Xj^) ’ * ’ (a 
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PROOF 


We determine the constant a n . For this we set x = x n and solve (6”) algebraically for ci n . 
Replacing g n (jc w ) according to (6') and using p n (x n ) = f n9 we see that this gives 


( 7 ) 


_ fn Pn— lC*n) 

" (*n - X 0 )(x n -*!)•■• (A„ - A„_x) ‘ 


We write a k instead of a n and show that a k equals the kth divided difference, recursively 
denoted and defined as follows: 

, r , fi ~ fo 

fll = /[*o> *1] = 

X 1 *0 


and in general 


a 2 = /Uo, Xi, x 2 ] 


fix i, a 2 ] - f[x Q , jcJ 
x 2 ~ x 0 


( 8 ) 


= fix 0 , • • • , x k ] = 


fix* • • • . - f[Xo, • • • . *k-l] 


X k - Xq 


If n = 1, then p n -i( x n) = Pol x i) = fo because p 0 (x) is constant and equal to f 0 , the value 
of f(x) at Xq. Hence (7) gives 

fi ~ Po( x i) fi ~ fo rr , 

«i = = = f[Xo, Jfj, 

*1 - Xq X 1 - Xq 

and (6) and (6") give the Newton interpolation polynomial of the first degree 

Pi(x) = f 0 + (x- X 0 )f[X Q, Aj]. 

If n = 2, then this /?x and (7) give 


a 2 ~ 


fz Piix-z) 

(X 2 - X 0 )(x 2 - A x ) 


f 2 fo ( x 2 ~ A o)/[A-0, Xj] 
(x 2 - X 0 )(x 2 ~ A X ) 


/[a 0 , Xi, x 2 ] 


where the last equality follows by straightforward calculation and comparison with the 
definition of the right side. (Verify it; be patient.) From (6) and (6 W ) we thus obtain the 
second Newton polynomial 

p 2 (x) = fo + (x - Xo)f[x 0 , A'l] + (A - A 0 )(A - A!)/[A 0 , Aj, A 2 ]. 

For n = k, formula (6) gives 

(9) Pkix) - Pk-lix ) + (X ~ A 0 )(A - Ax) • • • (A - X k -i)f[XQ, • • • , aJ. 

With poix) = f 0 by repeated application with k = 1, • • • , n this finally gives Newton’s 
divided difference interpolation formula 


( 10 ) 


fix) ~ fo + (a - A 0 )/[A 0 , A'l] + (a - A 0 )(a ~ Ax)/[Ao, Aj, A 2 ] 
+ ••• + (*- A 0 )(A - Ax) ’ • ’ (A - A n _x)/[A 0 , • • • , A-J. 
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EXAMPLE 4 


An algorithm is shown in Table 19.2. The first do-loop computes the divided differences 
and the second the desired value p n (x). 

Example 4 shows how to arrange differences near the values from which they are 
obtained; the latter always stand a half-line above and a half-line below in the preceding 
column. Such an arrangement is called a (divided) difference table. ■ 


Table 19.2 Newton’s Divided Difference Interpolation 


ALGORITHM INTERPOL (* 0 , • * • , jc n ; / 0 , • • * , / n ; Jr) 
This algorithm computes an approximation p n (x) of f(x) at jc. 
INPUT: Data (. x 0 , / 0 ), (x lt / x ), • • • , (x n , f n ); x 


OUTPUT: Approximation p n (Jt) of f(jt) 

Set flxj] = fj (j = 0, • • • , n). 

For m = !,•••,/? — 1 do: 

For j = 0, • • • , n — m do: 

t _ ffcj+i* * 

f[Xj, i Xj +rn \ 


» Xj+m\ flty 


End 


End 


x j+m— 


ll 


Set p 0 (x) = f 0 . 

For k = 1, • • • , n do: 

PkW = Pk- it*) + C* “ *o) • • * (i - x h _ x )f[x 0 , • • • , x h ] 

End 

OUTPUT p n (jt) 

End INTERPOL 


Newton’s Divided Difference Interpolation Formula 

Compute /( 9.2) from the values shown in the first two columns of the following table. 


x, fj = f(Xj) 

f[xj, x j+ a ] f[ xj, x j+1 , x j+2 ] f[xj, • • • , x j+z ] 

8.0 (2.079 442) 

9.0 2.197 225 

9.5 2.251 292 

11.0 2.397 895 

(0.117 783) 

(-0.006433) 

0.108134 (0.000411) 

-0.005 200 

0.097 735 
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Solution . We compute the divided differences as shown. Sample computation: 
(0.097 735 - 0.108 134)/(1 1 - 9) = -0.005 200. 
The values we need in (10) are circled. We have 


/(•v) ~ p 3 (x) = 2.079 442 4- 0.117 783(.v - 8.0) - 0.006 433(a* - 8.0)(a - 9.0) 

-»- 0.00041 l(.v - 8.0 )(a - 9.0)(.x - 9.5). 


At a- = 9.2, 


/( 9.2) ~ 2.079 442 4- 0.141 340 - 0.001 544 - 0.000 030 = 2.219 208. 


The value exact to 6D is /(9.2) = In 9.2 = 2.219 203. Note that we can nicely see how the accuracy increases 
from term to term: 


Pi(9.2) = 2.220 782, p 2 {92) = 2.219 238. p 3 (9.2) = 2.219 208. ■ 

Equal Spacing: Newton’s Forward Difference Formula 

Newton’s formula (10) is valid for arbitrarily spaced nodes as they may occur in practice 
in experiments or observations. However, in many applications the a/s are regularly 
spaced — for instance, in measurements taken at regular intervals of time. Then, denoting 
the distance by /?, we can write 

(11) A 0 , A'! = A'o + h, x 2 = A 0 + 2h, • • • , x n = a 0 + nh. 

We show how (8) and (10) now simplify considerably! 

To get started, let us define the first forward difference of / at Xj by 

A/j = fj+i ~ fj, 

the second forward difference of f at Xj by 

A 2 fj = A/,- +1 - Afj, 

and, continuing in this way, the kt h forward difference of / at Xj by 

(12) A k fj = A k ~ 1 fj+i - A k ~ 1 f j (k = 1, 2, • • •)• 

Examples and an explanation of the name “forward” follow on p. 806. What is the point 
of this? We show that if we have regular spacing (11), then 

(13) /[*„,•••,**] = A k / 0 . 

We prove (13) by induction. It is true for k = 1 because = x 0 + h, so that 

/[ * 0 ’ ^ l] = = I (fl ~ /o) = 17 a a/ °- 
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Assuming (13) to be true for all forward differences of order k , we show that (13) holds 
for it + I . We use (8) with k -1- 1 instead of k; then we use ( k + 1 )h = x k+x - a* 0 , resulting 
from (11), and finally (12) with j = 0, that is, A k+1 /o = A fc /i “ A ,c / 0 . This gives 


/[-Vo, * * ‘ , Xk+li ~ 


/[-V 1 ? * * * , -V/c-f-l] flXo, > *V/c] 


(it + 1 )/l 


1 


(it + 1 ) /? 

1 

(k + l)!/? k+1 


.57 ^ - FF 4 ‘'°. 


A /o 


which is (13) with A' + 1 instead of k. Formula (13) is proved. 


In (10) we finally set .v = ,v 0 + rh. Then x — a 0 = rh, x — x x = (;• — l)/i since 
A‘x — Xq = h, and so on. With this and (13), formula (10) becomes Newton’s (or 
Gregoty 2 -Newton ’s) forward difference interpolation formula 


(14) 


m 



(a - = A'o + rh, r — ( x — x 0 )/h) 


— fo + '"A/o + 


r(r ~ 1) 

21 


A 2 /o + • • • + 


r(r — 1) • • • (r — n + 1) 
n! 


A n / 0 


where the binomial coefficients in the first line are defined by 

- 1)0- - 2) • • • (r - s + 1) 


< is > t) - (:) - - 


s\ 


(s > 0, integer) 


s. 


and si = l • 2 

Error. From (5) we get, with jc — jc 0 = rh, x — x x = (r — 1 )h , etc.. 


un+l 


(16) 


^(•v) = fix) - p n (x) = 


in + 1)! 


r(r - I ) • • • (r - n)f n+1 \t) 


with t as characterized in (5). 

Formula ( 1 6) is an exact formula for the error, but it involves the unknown t. In Example 
5 (below) we show how to use (16) for obtaining an error estimate and an interval in 
which the true value of /(jc) must lie. 

Comments on Accuracy. (A) The order of magnitude of the error ^(jc) is about equal 
to that of the next difference not used in p n (x). 

(B) One should choose .v 0 , • • • , jc„ such that the x at which one interpolates is as well 
centered between x 0 , • • • , x n as possible. 


2 JAMES GREGORY (1 638-1675). Scots mathematician, professor at St. Andrews and Edinburgh. A in (14) 
and V 2 (on p. 807) have nothing to do with the Laplacian. 
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EXAMPLE 5 


The reason for (A) is that in (16), 


r + \t) ~ 


A n+1 /(0 

h n+1 ’ 


Hr -!)•••(<•- «)1 g j if 
I • 2 • • • (n + 1) 


§1 


(and actually for any r as long as we do not extrapolate ). The reason for (B) is that 
| r(r — !)•••(/* — n ) | becomes smallest for that choice. 


Newton’s Forward Difference Formula. Error Estimation 

Compute cosh 0.56 from (14) and the four values in the following table and estimate the error. 


j 

Xj fj = cosh Xj 

A fi A % A 3 /,- 

0 

0.5 (1.127 626) 




(0.057 839) 

1 

0.6 1.185 465 

(0.011 865) 



0.069 704 (0.000 697) 

2 

0.7 1.255 169 

0.012 562 



0.082 266 

3 

0.8 1.337 435 



Solution . We compute the forward differences as shown in the table. The values we need are circled. In 
(14) we have r = (0.56 - 0.50)/0.1 = 0.6, so that (14) gives 

0.6 (—0.4) 0.6(— 0.4)(— 1.4) 

cosh 0.56 1.127 626 + 0.6 • 0.057 839 + 0.01 1 865 + t 0.000 697 

2 o 

= 1.127 626 + 0.034 703 - 0.001 424 + 0.000 039 
= 1.160 944. 

Error estimate . From (16), since the fourth derivative is cosh c4) t = cosh i, 

0.1 4 

e 3 (0.56) = — -0.6(-0.4)(-1.4)(-2.4)cosh/ 

= A cosh r, 

where A = -0.000 003 36 and 0.5 ^ t ^ 0.8. We do not know t. but we get an inequality by taking the largest 
and smallest cosh t in that interval: 


A cosh 0.8 ^ € 3 ( 0 . 62 ) ^ A cosh 0.5. 


Since 
this gives 

Numeric values are 


fix) = p 3 (x) + e 3 (.v), 

p 3 (0,56) 4- A cosh 0.8 ^ cosh 0.56 ^ /> 3 (0.56) 4 A cosh 0.5. 
1.160 939 ^ cosh 0.56 ^ 1.160 941. 


The exact 6 D- value is cosh 0.56 = 1.160 941. It lies within these bounds. Such bounds are not always so light. 
Also, we did not consider roundoff errors, which will depend on the number of operations. I 


This example also explains die name “ forward difference formula”: we see that the 
differences in the formula slope forward in the difference table. 
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EXAMPLE 6 


Equal Spacing: Newton’s Backward Difference Formula 

Instead of forward-sloping differences we may also employ backward-sloping differences. 
The difference table remains the same as before (same numbers, in the same positions), except 
for a veiy harmless change of the running subscript j (which we explain in Example 6, below). 
Nevertheless, purely for reasons of convenience it is standard to introduce a second name 
and notation for differences as follows. We define the first backward difference of / at Xj by 

V/j = fj - fj-1, 

the second backward difference of f at xj by 

V 2 /i = V/, - V£_ lt 

and, continuing in this way, the fcth backward difference of f at xj by 

(17) V% = V k ~ l fj - V fc -V,-i (k = 1,2, ■ ■ •)• 

A formula similar to (14) but involving backward differences is Newton’s (or 
Gregory-Newton’s) backward difference interpolation formula 


fix) 


( 18 ) 


” (r + s - 1\ 

= p n (x) = 2 ( I V*/ 0 (x = x 0 + rh, r = (x - x 0 )/h) 

3=0 ' S f 


= fo + rVfa + 




r(r -f 1) •••(/- 4- n - 1) 


n\ 


V”/ 0 . 


Newton’s Forward and Backward Interpolations 

Compute a 7D-value of the Bessel function J 0 {x) for a- = 1.72 from the four values in the following table, using 
(a) Newton’s forward formula (14), (b) Newton’s backward formula (18). 


jfor 

./back 

x j 

j oiXj) 

1st Diff. 

2nd Diff. 

3rd Diff. 

0 

-3 

1.7 

0.397 9849 

-0.057 9985 



1 

-2 

1.8 

0.339 9864 

-0.058 1678 

-0.000 1693 

0.0004093 

2 

-1 

1.9 

0.281 8186 

-0.057 9278 

0.000 2400 


3 

0 

2.0 

0.223 8908 





Solution . The computation of the differences is the same in both cases. Only their notation differs. 

(a) Forward. In (14) we have r = (1.72 - 1.70)/0.1 = 0.2, and j goes from 0 to 3 (see first column). In 
each column we need the first given number, and (14) thus gives 

Jo( 1-72) = 0.397 9849 + 0.2(-0.057 9985) + - 2( ~°' 8) (-0.000 1693) + ^ 2( ~ 0 ^ )( ~ L8 ) , 0 .0004093 

6 

= 0.397 9849 - 0.011 5997 + 0.000 0135 + 0.000 0196 = 0.386 4183, 
which is exact to 6D, the exact 7D-value being 0.386 4185. 
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(b) Backward. For (18) we use j shown in the second column, and in each column the last number. Since 
r = (1.72 — 2.00)/0. 1 = -2.8. we thus get from (18) 

— 2.8 ( — 1 . 8 ) - 2 . 8 (- 1 . 8 )(— 0 . 8 ) 

7 0 d .72) « 0.223 8908 - 2.8(-0.057 9278) + 0.000 2400 + 0.000 4093 

= 0.223 8908 + 0.162 1978 + 0.000 6048 - 0.000 2750 

= 0.386 4184. ■ 


Central Difference Notation 

This is a third notation for differences. The first central difference of /( x) at Xj is defined 
by 

8fj = fj+ 1/2 ~ fj- 1/2 


and the fcth central difference of /(, x) at Xj by 

( 19 ) 8 k fj = 1/2 - i /2 U = 2, 3, • ■ •). 

Thus in this notation a difference table, for example, for / 0 , f v f 2 * looks as follows: 
*-i 

-v 0 

- v i 
A*2 

Central differences are used in numeric differentiation (Sec. 19.5), differential equations 
(Chap. 21), and centered interpolation formulas (e.g., Everett’s formula in Team Project 
22). These are formulas that use function values “symmetrically” located on both sides 
of the interpolation point a*. Such values are available near the middle of a given table, 
where centered interpolation formulas tend to give better results than those of Newton’s 
formulas, which do not have that “symmetry” property. 


/-i 

f 0 

Jl 

f 2 


8f - 1/2 
8f 1/2 

Sf‘3/2 


8 1 2 f 0 

S 2 fx 


8 3 4 5 f 1/2 




1. (Linear interpolation) Calculate p x (x) iti Example 1. 
Compute from it In 9.4 » P\(9A). 

2. Estimate the error in Prob, 1 by (5). 

3. (Quadratic interpolation) Calculate the Lagrange 

polynomial p 2 { x) for the 4D-vaIues of the Gamma 
funcuon [(24), App. 3.1] T( 1 .00) = 1.0000, 

T(1.02) = 0.9888, T(1.04) = 0.9784, and from it 
approximations of TCv) for .v = 1.005. 1.010, 1.015, 
1.025, 1.030, 1.035. 

4. (Error bounds) Derive error bounds for p 2 (9.2) in 
Example 2 from (5). 

5. (Error function) Calculate the Lagrange polynomial 
P 2 (a*) for the 5D- values of the error function 


f(x) = erf.v = (2/Vtt) Jo e d\\\ namely, 
/(0.25) = 0.27633, /(0.5) = 0.52050, /( 1 ) = 0.84270, 
and from p 2 an approximation of /(0.75) (= 0.71 1 16, 
5D). 

6. Derive an error bound in Prob. 5 from (5). 

7. (Sine integral) Calculate the Lagrange polynomial 
p 2 (x) for the 4D-values of the sine integral Si(.v) [(40) 
in App. 3.1], namely, Si(0) = 0, Si(l) = 0.9461, 
Si(2) = 1.6054, and from p 2 approximations of Si(0.5) 
(= 0.4931, 4D) and Si(1.5) (= 1.3247, 4D). 

8. (Linear and quadratic interpolation) Find e“ 0,25 and 
e~ 0 75 by linear interpolation with ,v 0 = 0, x } = 0.5 and 
a 0 = 0.5, x 1 = 1, respectively. Then find p 2 (x) 
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interpolating e~ x with a 0 = 0, x l = 0.5, x 2 = 1 and 
from it e~ 0 25 and e~ o: 75 . Compare the errors of these 
linear and quadratic interpolations. Use 4D-values of e~ x . 

9. (Cubic Lagrange interpolation) Calculate and sketch 
or graph L 0 , L 1% L 2 , L 3 for x = 0, 1, 2, 3 on common 
axes. Find p 3 (.v) for the data 

( 0 , 1 ) 

(1,0.765198) 

(2, 0.223891) 

(3, -0.260052) 

[values of the Bessel function J 0 (.v)]. Find p 3 for 
x = 0.5, 1.5, 2.5 by interpolation. 

10. (Interpolation and extrapolation) Calculate p 2 {x) in 
Example 2. Compute from it approximations of In 9.4, 
In 10, In 10.5, In 11.5, In 12, compute the errors by 
using exact 4D-values, and comment. 

11. (Extrapolation) Does a sketch or graph of the product 
of the (x — .Vj) in (5) for the data in Prob. 10 indicate 
that extrapolation is likely to involve larger errors than 
interpolation does? 

12. (Lower degree) Find the degree of the interpolation 
polynomial for the data 

(-2. 33) 

(0, 5) 

(2, 9) 

(4, 45) 

(6. 113). 

13. (Newton’s forward difference formula) Set up (14) 
for the data in Prob. 7 and derive p 2 {x) from (14). 

14. Set up Newton’s forward difference formula for the 
data in Prob. 3 and compute r(l.Ol), T(1.03), T(1.05). 

15. (Newton’s divided difference formula) Compute 
/(0.8) and /(0.9) from 

/(0.5) = 0.479 

/( 1.0) = 0.841 

/(2.0) = 0.909 

by quadratic interpolation. 

16. Compute /(6.5) from 


17. (Central differences) Write the difference in the table 
in Example 5 in central difference notation. 

18. (Subtabulation) Compute the Bessel function J x (x) for 
x = 0. 1, 0.3, • • • , 0.9 from J^O) = 0, = 0.09950, 
^(0.4) = 0.19603,^(0.6) = 0.28670, ^(0.8) = 0.36884, 
7 X ( 1.0) = 0.44005. Use (14) with n = 5. 

19. (Notations) Compute a difference table of f(x) = x z 
for x = 0, 1, 2, 3, 4, 5. Choose .v 0 = 2 and write all 
occurring numbers in terms of the notations (a) for 
central differences, (b) for forward differences, (c) for 
backward differences. 

20. CAS EXPERIMENT. Adding Terms in Newton 
Formulas. Write a program for the forward formula 
(14). Experiment on the increase of accuracy by 
successively adding terms. As data use values of some 
function of your choice for which your CAS gives the 
values needed in determining errors. 

21. WRITING PROJECT. Interpolation: Comparison 
of Methods. Make a list of 5-6 ideas that you feel are 
most basic in this section. Arrange them in the best 
logical order. Discuss them in a 2-3 page report. 

22. TEAM PROJECT. Interpolation and Extrapo- 
lation. (a) Lagrange practical error estimate (after 
Theorem 1). Apply this to Pi(9.2) and p 2 (9.2) for the 
data a'o = 9.0, x x = 9.5, x 2 = 1 1.0, f 0 — In .v 0 , 
/i = In a*j , f 2 = In .v 2 (6S-values). 

(b) Extrapolation. Given (xj. f(xj)) = (0.2. 0.9980), 
(0.4, 0.9686), (0.6, 0.8443), (0.8, 0.5358), (1.0, 0). 
Find /(0.7) from the quadratic interpolation 
polynomials based on (a) 0.6, 0.8, 1.0, (/3) 0.4, 0.6, 
0.8, (y) 0.2, 0.4, 0.6. Compare the errors and comment. 
[Exact /(a) = cos (±ra 2 ), /(0.7) = 0.7181 (4S).] 

(c) Graph the product of factors (, x - xj) in the error 
formula (5) for n = 2, • * • , 10 separately. What do 
these graphs show regarding accuracy of interpolation 
and extrapolation? 

(d) Central differences. Show that 

S 2 /m = fm + 1 - 2 frn + fm-i* and, furthermore 
^ f m+ 1/2 — fin + 2 — ^im-U ^ fm ~~ fm— 1» 

5”f m = A"/ w _ n/2 = V*/m + n/2. 

(e) Everett’s interpolation formula 


( 20 ) 


f{x) - (1 - r)/ 0 + rf ± 

(2 - r)(l - r)(— r) 
" 3! 


d 2 f 0 


/(6.0) = 0.1506 
/(7.0) = 0.3001 
/( 7.5) = 0.2663 
/(7.7) = 0.2346 

by cubic interpolation, using (10). 




(r + Mr - I) 
3! 


8 2 fx 


is an example of a formula involving only even-order 
differences. Use it to compute the Bessel function J 0 (x) 
for x = 1.72 from y o (1.60) = 0.455 4022 and J 0 (L 7). 
7 0 (L8), / 0 (1.9) in Example 6. 
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19.4 Spline Interpolation 

Given data (function values, points in the Ay-plane) (a* 0 , / 0 ), (a* x , / x ), • • • , ( x n , f n ) can be 
interpolated by a polynomial P n (;t) of degree n or less so that the curve of P n (x ) passes 
through these n 4* 1 points (x j9 fj); here f 0 = /(a' 0 ), • • • , f n = f(x n ). See Sec. 19.3. 

Now if n is large, there may be trouble: P n (x ) may tend to oscillate for x between the 
nodes a* 0 , ■ • • , x n . Hence we must be prepared for numeric instability (Sec. 19. 1 ). Figure 
43 1 shows a famous example by C. Runge 3 for which the maximum error even approaches 
oe as n — > (with the nodes kept equidistant and their number increased). Figure 432 

illustrates the increase of the oscillation with n for some other function that is piecewise 
linear. 

Those undesirable oscillations are avoided by the method of splines initiated by 
I. J. Schoenberg in 1946 ( Quarterly of Applied Mathematics 4, pp. 45-99, 112-141). This 
method is widely used in practice. It also laid the foundation for much of modern CAD 
(computer-aided design). Its name is borrowed from a draftman*s spline , which is an 
elastic rod bent to pass through given points and held in place by weights. The 
mathematical idea of the method is as follows: 

Instead of using a single high-degree polynomial P n over the entire interval a^kx^kb 
in which the nodes lie, that is, 

(l) a = x 0 < x t < - - - < x n = b, 

we use n low-degree, e.g., cubic, polynomials 

q Q (x\ c/ x U), • ■ • , 

one over each subinterval between adjacent nodes, hence q Q from x 0 to x x , then q x from 




X 


Fig. 431. Runge's example /(x) = 1/(1 + x 2 ) and interpolating polynomial P 10 (x) 



Fig. 432. Piecewise linear function /(x) and interpolation polynomials of increasing degrees 


CARL RUNGE (1856-1927). German mathematician, also known for his work on ODEs (Sec. 21.1). 
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THEOREM 1 


PROOF 


Xi to x 2 , and so on. From this we compose an interpolation function g(x), called a spline, 
by Fitting these polynomials together into a single continuous curve passing through the 
data points, that is, 

(2) g(x o) = fix o) = U StVl) = f(x i) = /i, • • • , gW = /(*n) = /n. 

Note that g(A*) = # 0 (x) when ,v 0 ^ a* = a x , then g(x) = qi(x) when x 1 ^ x ^ a* 2 , and so 
on. according to our construction of g . 

Thus spline interpolation is piecewise polynomial interpolation . 

The simplest g/s would be linear polynomials. However, the curve of a piecewise linear 
continuous function has corners and would be of little interest in general — think of 
designing the body of a car or a ship. 

We shall consider cubic splines because these are the most important ones in 
applications. By definition, a cubic spline g(x) interpolating given data (x 0 , / 0 ), • • • , 
( x n , /*) is a continuous function on the interval a = a * 0 = a ' ^ x n = b that has continuous 
first and second derivatives and satisfies the interpolation condition (2); furthermore, 
between adjacent nodes, g( x) is given by a polynomial qj(x) of degree 3 or less. 

We claim that there is such a cubic spline. And if in addition to (2) we also require that 

(3) £'(a-o) = *o> s'M = K 

(given tangent directions of g(x) at the two endpoints of the interval a ^ x b ), then we 

have a uniquely determined cubic spline. This is the content of the following existence 
and uniqueness theorem, whose proof will also suggest the actual determination of splines. 
(Condition (3) will be discussed after the proof.) 


Existence and Uniqueness of Cubic Splines 

Let (x 0 , / 0 ), (a'i, / x ), • • • , (a w , /„) with arbitrarily spaced given Xj [see ( 1 )] and given 
fj = f( x j)*j = 0, I ,*••,/?. Let k 0 and k n be any given numbers . Then there is one 
and only one cubic spline g(x) corresponding to ( 1) and satisfying (2) and (3). 


By definition, on every subinterval Ij given by X, ^ a* ^ Xj+i the spline g(x) must agree 
with a polynomial qj(x) of degree not exceeding 3 such that 

(4) f(. x j+ x) 0* o, l, * * * , n l). 

For the derivatives we write 


(5) q'ixj) = kj , 


tfj&j+i) kj + 1 


O’ = 0, 1 , • • • , n - 1) 


with k 0 and k n given and k l9 • • • , k n _ x to be determined later. Equations (4) and (5) are 
four conditions for each qj(x). By direct calculation, using the notation 


(6*) 



1 


■9+1 “ X 3 


{j = 0, 1, • • • , n - 1) 


we can verify that the unique cubic polynomial q 5 {x) (j = 0, 1. • • • , /i - I) satisfying 
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(4) and (5) is 

<lj(x) = /(Aj)c/(x - X( +1 ) 2 [l + 2 Cj(x - Aj)] 

+ f(X j+1 )Cj 2 (x - Aj) 2 [l - 2 Cj(x - A j+1 )] 

( 6 ) 

+ kj-c/(x - xj)(x - x j+1 ) 2 

+ *j+ lCj 2 (X - Xjf{X - x j+1 ). 

Differentiating twice, we obtain 

(7) q'-ixj) = -6c//(Xj) + 6cj 2 f(x j+l ) - 4 Cjkj - 2cjk j+1 

(8) = 6 cffOCj) - 6 Cj 2 f(x j+1 ) + 2cjkj + 4c^ +1 . 

By definition, g(A*) has continuous second derivatives. This gives the conditions 

q'U = (y = i, • * • , /I - i). 

If we use (8) with 7 replaced by 7 — l, and (7), these /1 — 1 equations become 

(9) + 2 (<:.,•_! + c,)*, + c^ +1 = 3[cf_iV/.,- + c/V/ i+1 ] 

where = f(xj) - /(xj^) and Vf j+1 = f(x j+1 ) - f(xj) and j - !,•••,«- 1, as 
before. This linear system of n — 1 equations has a unique solution k x , • • • , k n _ x since 
the coefficient matrix is strictly diagonally dominant (that is, in each row the (positive) 
diagonal entry is greater than the sum of the other (positive) entries). Hence the determinant 
of the matrix cannot be zero (as follows from Theorem 3 in Sec. 20.7), so that we may 
determine unique values k l9 • • • , k n _ 1 of the first derivative of g(x) at the nodes. This 
proves the theorem. ■ 

Storage and Time Demands in solving (9) are modest, since the matrix of (9) is sparse 
(has few nonzero entries) and tridiagonal (may have nonzero entries only on the diagonal 
and on the two adjacent “parallels” above and below it). Pivoting (Sec. 7.3) is not necessary 
because of that dominance. This makes splines efficient in solving large problems with 
thousands of nodes or more. For some literature and some critical comments, s tt American 
Mathematical Monthly 105 (1998), 929-941. 

Condition (3) includes the clamped conditions 

(10) g'(x 0 ) = /Vo), g\x n ) = f\x n ), 

in which the tangent directions / Vo) and f'(x n ) at the ends are given. Other conditions 
of practical interest are the free or natural conditions 

( 11 ) 8 Vo) = 0, g"(x n ) = 0 

(geometrically: zero curvature at the ends, as for the draftman’s spline), giving a natural 
spline. These names are motivated by Fig. 290 in Problem Set 12.3. 
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EXAMPLE 1 


Determination of Splines* Let k 0 and k n be given. Obtain k l9 • • • , ^_x by solving the 
linear system (9). Recall that the spline g(*) to be found consists of n cubic polynomials 
q 0 , • • • , q n -i- We write these polynomials in the form 

( 12 ) q 3 (x) = cijo + a n (x - Xj) + a j2 (x - Xjf + a j3 (x - Xj) 3 
where j = 0, * • • , n - 1. Using Taylor’s formula, we obtain 

= fj 

ciji = 4j(xj) = 

( 13 ) a j2 = j q'j(xj) = |r (f j+1 - fj) - j (k j+1 + 2 kj) 

a j3 ~ "g" Q j( x j) ~ 1^3 ( fj ~ fj+ 1) j^2 (kj+i + kj) 

with Oj S obtained by calculating q”{Xj +l ) from (12) and equating the result to (8), that is, 

„ 6 2 
«jC%+ 1) = 2a j 2 + 6a j3 h = -j-£ ( fj - f j+1 ) + — (kj + 2k j+ i), 

and now subtracting from this 2aj 2 as given in (13) and simplifying. 

Note that for equidistant nodes of distance hj = h we can write Cj = c = I //2 in (6*) 
and have from (9) simply 

3 

( 14 ) kj-i + 4 kj + k j+1 = — ( f j+l - fj-x) (j = l, •••, n — 1 ). 


by (2), 
by (5), 

by (7), 


Spline Interpolation. Equidistant Nodes 

Interpolate f{x) = .v 4 on the interval —1 I by the cubic spline g(x) corresponding to the nodes 

a'o = -1. A'i = 0, a ’2 = 1 and satisfying the clamped conditions l) = /*(— 1). g'O) = / y (l). 

Solution . In our standard notation the given data are f 0 = /(-l) = I, f\— /( 0) = 0, f 2 = /(I) = 1. 

We have h = 1 and n = 2, so that our spline consists of n = 2 polynomials 

?0<-v) = «00 + «oi(v + 1) + a 0 2(-'- + I) 2 + «osU' + l) 3 (-1 S.V 0), 

= tiio + a u x + a 13 x z + n 13 .v 3 (0 S .v S I). 

We determine the kj from (14) (equidistance!) and then the coefficients of the spline from (13). Since n = 2, 
the system (14) is a single equation (with j = 1 and h = 1) 

k 0 + Ak x + k 2 = 3(/ 2 - J 0 ). 

Here f 0 = f 2 = I (the value of .v 4 at the ends) and k 0 = “4, k 2 — 4, the values of the derivative 4a* 3 at the 
ends -1 and 1. Hence 


-4 + 4*j + 4*3(1 - l) = 0. = 0. 

From (13) we can now obtain the coefficients of q Q , namely, a 00 = / 0 = 1. a 01 = k 0 = -4, and 
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EXAMPLE 2 


"02 « -jfftfi - fo) ~ }<*i + 2A- 0 ) = 3(0 - 1) - (0 - 8) = 5 

"os = -Jjtfo - fi) + Jz(h + k 0 ) = 2(1 - 0) + (0 - 4) = -2. 

Similarly, for the coefficients of we obtain from (13) the values <? 10 = f\ = 0, = ky = 0. and 

«12 = 3 (/ 2 - h) ~ (k 2 4 - 2* x ) = 3(1 - 0) - (4 + 0) = -1 

*13 = 2 (/! - / 2 ) + (* 2 + k x ) = 2(0 - 1) + (4 + 0) = 2. 

This gives the polynomials of which the spline g(x) consists, namely, 

r</ 0 (A) = 1 - 4(.v + 1) + 5(.v + l) 2 - 2(.v + l) 3 = -v 2 - 2 a 3 if - 1 s , v s o 
gW = , , 

UxU) = -.v 2 + 2 a 3 if 0 § jc § 1. 

Figure 433 shows f(x) and this spline. Do you see that we could have saved over half of our work by using 
symmetry? M 



Fig. 433. Function f(x) = x 4 and cubic spline g(x) in Example 1 

Natural Spline. Arbitrarily Spaced Nodes 

Find a spline approximation and a polynomial approximation for the curve of the cross section of the circular- 
shaped Shrine of the Book in Jerusalem shown in Fig. 434. 



Fig. 434. Shrine of the Book in Jerusalem (Architects F. Kissler and A. M. Bartus) 



SEC 19.4 Spline Interpolation 


815 


Solution . Thirteen points, about equally distributed along the contour (not along the A-axis!), give these data: 

Xj -5.8 -5.0 -4.0 -2.5 -1.5 -0.8 0 0.8 1.5 2.5 4.0 5.0 5.8 

fj 0 1.5 1.8 2.2 2.7 3.5 3.9 3.5 2.7 2.2 1.8 1.5 0 

The figure shows the corresponding Interpolation polynomial of 12th degree, which is useless because of its 
oscillation. (Because of roundoff your software will also give you small error terms involving odd powers of 
x.) The polynomial is 

p 12 (x) = 3.9000 - 0.65083a 2 + 0.033858a' 4 + 0.0J 1041a- 6 - 0.0014010a 8 
+ 0.000055595a 16 - 0.00000071867a 12 . 

The spline follows practically the contour of the roof, with a small error near the nodes -0.8 and 0.8. The spline 
is symmetric. Its six polynomials corresponding to positive .v have the following coefficients of their 
representations (12). (Note well that (12) is in terms of powers of a* — .v 7 -, not a!) 


j 

x-interval 

a j0 


a j2 

a j3 

0 

00 

q 

o 

d 

3.9 

0.00 

-0.61 

-0.015 

1 

0.8. ..1.5 

3.5 

-1.01 

-0.65 

0.66 

2 

1. 5. ..2.5 

2.7 

-0.95 

0.73 

-0.27 

3 

2.5...4.0 

2.2 

-0.32 

-0.091 

0.084 

4 

4.0.. .5.0 

1.8 

-0.027 

0.29 

-0.56 

5 

5.0... 5. 8 

1.5 

-1.13 

-1.39 

0.58 




1. WRITING PROJECT. Splines. In your own words, 
and using as few formulas as possible, write a short 
report on spline interpolation, its motivation, a 
comparison with polynomial interpolation, and its 
applications. 

2. (Individual polynomial qj) Show that qj(x) in (6) 
satisfies the interpolation condition (4) as well as the 
derivative condition (5). 

3. Verify the differentiations that give (7) and (8) from 

( 6 ). 

4. (System for derivatives) Derive the basic linear 
system (9) for • • • , x as indicated in the text. 

5. (Equidistant nodes) Derive (14) from (9). 

6. (Coefficients) Give the details of the derivation of cij 2 
and aj 3 in (13). 

7. Verify the computations in Example 1 . 

8. (Comparison) Compare the spline g in Example 1 with 
the quadratic interpolation polynomial over the whole 
interval. Find the maximum deviations of g and p 2 from 
/. Comment. 

9. (Natural spline condition) Using the given 
coefficients, verify that the spline in Example 2 
satisfies g'\x) = 0 at the ends. 


1 10—16 [ DETERMINATION OF SPUNES 

Find the cubic spline g{, x) for the given data with k 0 and 

as given. 

10. /(— 2) = /(-l) = /(l) = /( 2) = 0 , f(0) = 1 , 
k 0 = k 4 — 0 

11. If we started from the piecewise linear function in 
Fig. 435, we would obtain g(x) in Prob. 10 as the spline 
satisfying g'(-2) = /'(- 2) = 0, g\ 2) = /'( 2) = 0. 
Find and sketch or graph the corresponding 
interpolation polynomial of 4th degree and compare it 
with the spline. Comment. 



Fig. 435. Spline and interpolation 
polynomial in Problems 10 and 11 


12- fo = m = 1, ft = f( 2) = 9 ,f 2 = /( 4) = 41, 
h = m = 41, k 0 = 0, /fc 3 = -12 
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13. / o = /(“I) = 0. / ! = m = 4, f 2 = /( 1) = 0, 
k 0 = 0, k 2 = 0. Is g( x) even? (Give reason.) 

14. h = /( 0) = 0, / a = /(l) = 1, f 2 = f( 2) = 6, 
/ 3 = /(3) = 10, k 0 = 0, k 3 = 0 

15. / 0 = f(0) = 1. f x = /(l) = 0, / 2 = /( 2) = - 1, 
/ 3 = /(3) = 0, A' 0 = 0. k z = -6 

16. It can happen that a spline is given by the same 
polynomial in two adjacent subintervals. To illustrate 
this, find the cubic spline g( a) for /(a) = sin a 
corresponding to the partition a 0 = — tt/ 2, jt t = 0, 
a 2 = tt/2 of the interval —tt/2 ^ a ^ tt/2 and 
satisfying g'( — 77/2) = /'( — 77/2) and 

g'(77/2) = /'(tt/2). 

17. (Natural conditions) Explain the remark after (11). 

18. CAS EXPERIMENT. Spline versus Polynomial. If 
your CAS gives natural splines, find the natural splines 
when a is integer from -/// to m, and y( 0) = 1 and all 
other y equal to 0. Graph each such spline along with 
the interpolation polynomial p 2m . Do this for m — 2 to 
10 (or more). What happens with increasing ///? 

19. If a cubic spline is three times continuously differentiable 
(that is, it has continuous fust, second, and third 
derivatives), show that it must be a single polynomial. 

20. TEAM PROJECT. Hermite Interpolation and 
Bezier Curves. In Hermite interpolation we are 
looking for a polynomial p( x) (of degree 2 n + 1 or less) 
such that p(x) and its derivative p\x) have given values 
at;/ + 1 nodes. (More generally, p(x), p'(x), p'\x), * * 4 
may be required to have given values at the nodes.) 

(a) Curves with given endpoints and tangents. Let 
C be a curve in the .vy-plane parametrically represented 
by r(/) = [a(0, >•(/)], 0 ^ t ^ 1 (see Sec. 9.5). Show 
that for given initial and terminal points of a curve and 
given initial and terminal tangents, say, 

A: r 0 = [a( 0), y(0)] 

= [*o» Vo], 

B: r x = [A(l),y(l)] 

= [A'l, yi] 

Vo = [.v'(0), >-'(0)] 

= [.vo. yi]. 

v, = [jr'(l),/(D] 

= [•']. y[] 
we can find a curve C, namely, 
r(t) = r 0 + v 0 / 

+ (3(r, - r 0 ) - (2v 0 + v,))* 2 
+ (2(r 0 - rj) + v 0 + Vj)/ 3 : 


in components, 

■v it ) = a-o + xof + (3(ax - a-o) - (2a-; + x [)) t 2 
+ (2(a- 0 - -v,) + A-; + A-[)r 3 

y(t) = 3’o + y'o' + (3(.Vt - v 0 ) - (2 y' 0 + y[))t 2 
+ (2(y 0 - yi) +.)>; + y{>,». 

Note that this is a cubic Hermite interpolation 
polynomial, and n = I because we have two nodes (the 
endpoints of C). (This has nothing to do with the 
Hermite polynomials in Sec. 5.8.) The two points 

0 A : go = lo + v 0 

= [a * 0 + a-;, vo + y;] 
and 

G b ■ gi = r, - V! 

= [a'i — a* j, y i — y x] 

are called guidepoints because the segments AG A and 
BG b specify the tangents graphically. A, G A , G B 
determine C, and C can be changed quickly by moving 
the points. A curve consisting of such Hermite 
interpolation polynomials is called a Bezier curve, 
after the French engineer P. Bezier of the Renault 
Automobile Company, who introduced them in the 
early 1960s in designing car bodies. Bezier curves (and 
surfaces) are used in computer-aided design (CAD) and 
computer-aided manufacturing (CAM). (For more 
details, see Ref. [E21] in App. 1.) 

(b) Find and graph the Bezier curve and its 
guidepoints if A: [0, 0], B: [I, 0], v 0 = [|, |], 

vi = H, -5V3]. 

(c) Changing guidepoints changes C. Moving 
guidepoints farther away makes C “staying near the 
tangents for a longer time.” Confirm this by changing 
v 0 and v x in (b) to 2v 0 and 2v x (see Fig. 436). 

(d) Make experiments of your own. What happens if 
you change Vj in (b) to — v x . If you rotate the tangents? 
If you multiply v 0 and v x by positive factors less 
than 1? 



A b a 

Fig. 436. Team Project 20(b) and (c): Bezier curves 


( 15 ) 
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19.5 Numeric Integration and Differentiation 

Numeric integration means the numeric evaluation of integrals 

J = \ fix) dx 

J a 

where a and b are given and / is a function given analytically by a formula or empirically by 
a table of values. Geometrically, J is the area under the curve of / between a and b (Fig. 437). 

We know that if f is such that we can find a differentiable function F whose derivative 
is /, then we can evaluate J by applying the familiar formula 

r b 

J = I f(x) dx = Fib) - Fia) [F'to = fix)]- 

J a 

Tables of integrals or a CAS (Mathematica, Maple, etc.) may be helpful for this purpose. 

However, applications often lead to integrals whose analytic evaluation would be very 
difficult or even impossible, or whose integrand is an empirical function given by recorded 
numeric values. Then we may obtain approximate numeric values of the integral by a 
numeric integration method. 

Rectangular Rule. Trapezoidal Rule 

Numeric integration methods are obtained by approximating the integrand f by functions 
that can easily be integrated. 

The simplest formula, the rectangular rule, is obtained if we subdivide the interval of 
integration a^x^b into n subintervals of equal length h = (b — a) hi and in each subinterval 
approximate / by the constant /(*/), the value off at the midpoint*/ of the jth subinterval 
(Fig. 438). Then f is approximated by a step function (piecewise constant function), the n 
rectangles in Fig. 438 have the areas /(*/)/?, • • • , /(*/)/*, and the rectangular rule is 


( 1 ) 


J=[ fix) dx « h[fix x *) + fix 2 *) + ■■■ + /(*„*)] 

a 



b — a 
n 


)■ 


The trapezoidal rule is generally more accurate. We obtain it if we take the same 
subdivision as before and approximate / by a broken line of segments (chords) with 
endpoints [a> /(<*)], [* ls /( x x )] y ••■,[&, f(b)] on the curve of / (Fig. 439). Then the area 
under the curve of f between a and b is approximated by n trapezoids of areas 

§[/(«) + /(*!>]*. M/a-i) + /m/i, • • • , M/Mi) + m]h. 



Fig. 437. Geometric interpretation 
of a definite integral 
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EXAMPLE 1 


y | 



By taking their sum we obtain the trapezoidal rule 


(2) J - f fix) dx * h[y(a) + f(x : ) + f(x 2 ) + • • • + f(x n -i) + §/0)] 

J a 


where h = (b - a)/n, as in (1). The jc/s and a and b are called nodes. 
Trapezoidal Rule 

Evaluate J = I e ~ ** dx by means of (2) with n = 10. 

J o 

Solution . J ~ 0.1 (0.5 • 1.367 879 + 6.778 167) = 0.746 21 1 from Table 19.3. 


Table 19.3 Computations in Example 1 


j 

x i 

xj 2 

2 

e~ x > 


0 

0 

0 

1.000000 


1 

0.1 

0.01 


0.990050 

2 

0.2 

0.04 


0.960 789 

3 

0.3 

0.09 


0.913 931 

4 

0.4 

0.16 


0.852 144 

5 

0.5 

0.25 


0.778 801 

6 

0.6 

0.36 


0.697 676 

7 

0.7 

0.49 


0.612 626 

8 

0.8 

0.64 


0.527 292 

9 

0.9 

0.81 


0.444 858 

10 

1.0 

1.00 

0.367 879 


Sums 



1.367 879 

6.778 167 


Error Bounds and Estimate for the Trapezoidal Rule 

An error estimate for the trapezoidal rule can be derived from (5) in Sec. 19.3 with 
n = I by integration as follows. For a single subinterval we have 


fix) - Pi (.V) = (x - ,V 0 )(A- - A'j) 


f"(t ) 


2 
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with a suitable t depending on x , between x 0 and x v Integration over x from a = x Q to 
*i = *o + h gives 

r x ° +h h r , r Xo+h f"(t(x)) 

J f(x) dx - — [/(jc 0 ) + /(*!)] = J (x - x 0 )(x - x Q - h) — - — dx. 

J x 0 * .r 0 

Setting x — a * 0 = v and applying the mean value theorem of integral calculus, which we 
can use because ( x — x 0 )(x - * 0 — h) does not change sign, we find that the right side 
equals 



where t is a (suitable, unknown) value between a* 0 and x v This is the error for the 
trapezoidal rule with n = 1, often called the local error. 

Hence the error e of (2) with any n is the sum of such contributions from the 
n subintervals; since h = (b — a)/n, nh 3 = n(b — df/n z , and ( b — a) 2 = n 2 h 2 , we obtain 


(3) 


6 = 


(b - a) 3 
12n 2 


fit) = 


(fr ~ a) 
12 


h 2 f(t) 


with (suitable, unknown) t between a and b. 

Because of (3) the trapezoidal rule (2) is also written 

(2*) J=[ fix) dx = h[\fia) + fix x ) + • • • + fix n _ x ) + §/(&)] - ^ h 2 f”it). 

J a 

Error Bounds are now obtained by taking the largest value for say, M 2 , and the 
smallest value, M 2 *, in the interval of integration. Then (3) gives (note that K is negative) 

* (b — of b — a 0 

(4) KM 2 = € ~ KM 2 where K = - = — h 2 . 

Error Estimation by Halving h is advisable if h” is very complicated or unknown, 
for instance, in the case of experimental data. Then we may apply the Error Principle 
of Sec. 19.1. That is, we calculate by (2), first with /t, obtaining, say, J = J k + e h , , and 
then with |/t, obtaining J = J hf2 + e hi2 . Now if we replace h 2 in (3) with (|/t) 2 , the error 
is multiplied by 1/4. Hence e hf2 ~ \e h (not exactly because r may differ). Together, 
4/2 + *h /2 = 4 + € h ~ 4 + 4e ;i/2 . Thus 4 /2 - 4 = (4 - Dcfc/a- Division by 3 
gives the error formula for J h/2 


( 5 ) £/i/2 % j ( 4/2 4 )» 

Error Estimation for the Trapezoidal Rule by (4) and (5) 

Estimate the error of the approximate value in Example 1 by (4) and (5). 

Solution . (A) Error bounds by (4). By differentiation, f\x) - 2{2x 2 — Also, /"'(*) > 0 if 

0 < ^ < 1* so that the minimum and maximum occur at the ends of the interval. We compute 
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M 2 = /"( t) = 0.735 759 and M 2 * = /"(0) = -2. Furthermore, K = - 1/1200. and (4) gives 

—0.000 614 ^ e ^ 0.001 667. 

Hence the exact value of J must lie between 

0.746 21 1 - 0.000 614 = 0.745 597 and 0.746 21 1 -F 0.001 667 = 0.747 878. 


Actually, J = 0.746 824, exact to 6D. 

(B) Error estimate by (5). J h = 0.7462 1 1 in Example I . Also, 


J}i(2 ~~ 0.05 


2 e“^ /20)2 + y(l+ 0.367879) 
Li-3 


= 0.746671. 


Hence € h/2 = \Uht2 ~ = 0.000153 and J h/2 + € hf2 = 0.746824, exact to 6D. ■ 

Simpson’s Rule of Integration 

Piecewise constant approximation of f led to the rectangular rule (1), piecewise linear 
approximation to the trapezoidal rule (2), and piecewise quadratic approximation will lead 
to Simpson’s rule, which is of great practical importance because it is sufficiently accurate 
for most problems, but still sufficiently simple. 

To derive Simpson’s rule, we divide the interval of integration cr^x^b into an even 
number of equal subintervals, say, into n = 2m subintervals of length h = (b — a)/(2m ), 
with endpoints x 0 (= a), x l9 • • • , x 2m _i, x 2 m ( = b)\ see Fig. 440. We now take the first 
two subintervals and approximate /(a) in the interval a 0 ^ a* ^ x 2 = a 0 + 2 It by the 
Lagrange polynomial p 2 (x) through (a 0 , /o)> (*i» fi\ (a 2 , f 2 ), where fj = /(a,-). From (3) 
in Sec. 1 9.3 we obtain 


( 6 ) 


P2W = 


(a - x x )(x - a 2 ) 
(a 0 - a 1 )(a 0 - a 2 ) 


fo + 


(A - a‘ 0 )(a - a 2 ) + 

U'l - Ao)(A! - a 2 ) Jl 


(A - Aq)(a - Ax) 

( x 2 “ *o)(*2 - *l) j2 ‘ 


The denominators in (6) are 2 h 2 , — h 2 , and 2 /? 2 , respectively. Setting s = (a — a { j / h , we 
have 

x — x 1 = sh, a — a 0 = a — (aj — h) = (s 4- \)h 

x — a 2 = a — (a*i + h) = ( s ~ l)/7 

and we obtain 

P 2 W = H* - D/o - (S + D(* - l)/l + Us + 1)^2- 



X 
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We now integrate with respect to x from x 0 to x 2 . This corresponds to integrating with 
respect to s from —1 to 1. Since dx = h ds, the result is 

(7*) J fix) dx « J p 2 (x) dx = h ^ J f 0 + j fi + j . 

A similar formula holds for the next two subintervals from x 2 to -v 4 , and so on. By summing 
all these m formulas we obtain Simpson’s rule 4 

r b h 

(7) J fix) dx = - (f 0 + 4fi + 2/ 2 + 4/ 3 + • • • + 2/ 2 , n _ 2 + 4/am-i + / 2 m). 

J a 3 

where h — {b — a)/ {2m) and fj = f{xj). Table 19.4 shows an algorithm for Simpson's 
rule. 


Table 19.4 Simpson’s Rule of Integration 

ALGORITHM SIMPSON (a, b , m, / 0 , / lf • • • , f 2m ) 

This algorithm computes the integral J = /£/(*) dx from given values fj = f(xj) at 
equidistant .v 0 = a , x x = a* 0 + h, • • • , A 2m = a 0 + 2m/i = Z> by Simpson’s rule (7), 
where /? = (b — a)/{2m). 

INPUT: a. b t m, /o, • • • , / 2w 

OUTPUT: Approximate value J of J 

Compute *(, = fo + / 2 m 

•Vi “ /i + h + * ’ * + /am— l 

*2 = fz + U + • * • + / 2m— 2 
h = (b — g )/ 2 j 7 j 

J = y (i'o + 4^ + 2i2) 

OUTPUT I Stop. 

End SIMPSON 


Error of Simpson’s Rule (7). If the fourth derivative / (4) exists and is continuous on 
a ^ a* ^ b, the error of (7), call it e s , is 


( 8 ) 


€ s - ~ 


(± ~ of 
1 80 (2m) 4 


/< 4) (?) = - 


(6 ~ *?) 
180 


* 4 / <4) (?); 


THOMAS SIMPSON (17I0— 1761>, self-taught English mathematician, author of several popular textbooks. 
Simpson's rule was used much earlier by Torricelli. Gregory (in 1668), and Newton (in 1676). 
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here t is a suitable unknown value between a and b . This is obtained similarly to (3). With 
this we may also write Simpson’s rule (7) as 


(7**) 



dx - y (/o + 4/i + 


+ /2m) “ 


(fr ~ A) 
180 


/? 4 / (4) (f). 


Error Bounds. By taking for / (4) in (8) the maximum M 4 and minimum M 4 * on the 
interval of integration we obtain from (8) the error bounds (note that C is negative) 


(9) CM 4 ^ e s ^ CM 4 * where C 


(fr ~ af 

180 (2m) 4 


(fr ~ a) 
180 


h\ 


Degree of Precision (DP) of an integration formula . This is the maximum degree of 
arbitrary polynomials for which the formula gives exact values of integrals over any 
intervals. 

Hence for the trapezoidal rule. 


DP = l 

because we approximate the curve of / by portions of straight lines (linear polynomials). 
For Simpson’s rule we might expect DP = 2 (why?). Actually, 

DP = 3 

by (9) because f 4) is identically zero for a cubic polynomial. This makes Simpson’s rule 
sufficiently accurate for most practical problems and accounts for its popularity. 

Numeric Stability with respect to rounding is another important property of Simpson’s 
rule. Indeed, for the sum of the roundoff errors €j of the 2m 4- 1 values fj in (7) we obtain, 
since h — {b — a) 1 2m, 

h . , {b — a) 

J ko + 4 €j. + • • • + e 2 J ^ . — 6 mu = (b - a)u 

where u is the rounding unit (it = \ ■ 10 -6 if we round off to 6D; see Sec. 19.1). Also 
6 = 1 4- 4 4- 1 is the sum of the coefficients for a pair of intervals in (7); take m = 1 in 
(7) to see this. The bound (b — a)u is independent of m, so that it cannot increase with 
increasing m, that is, with decreasing h. This proves stability. ■ 

Newton-Cotes Formulas. We mention that the trapezoidal and Simpson rules are 
special closed Newton-Cotes formulas , that is, integration formulas in which f(x) is 
interpolated at equally spaced nodes by a polynomial of degree n (n — 1 for trapezoidal, 
n = 2 for Simpson), and closed means that a and b are nodes (a = ,v 0 , b = x n ). n = 3 
(the three-eighths rule; Review Prob. 33) and a higher n are used occasionally. From 
11 = 8 on, some of the coefficients become negative, so that a positive f d could make a 
negative contribution to an integral, which is absurd. For more on this topic see Ref. [E25] 
in App. 1. 
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EXAMPLE 


Simpson’s Rule. Error Estimate 

r 1 ^ 

Evaluate J = I e dx by Simpson's rule with 2m = 10 and estimate the error. 

•'o 

Solution . Since /j = CU, Table 19.5 gives 

( 1 .367 879 + 4 • 3.740 266 + 2 • 3.037 901) = 0.746 825. 

Estimate of error. Differentiation gives /^(.v) = 4(4.v 4 - 12v 2 + 3)^”^. By considering the derivative / 5) 
of f 4) we find that the largest value of / 4) in the interval of integration occurs at 0 and the smallest value at 
a** = (2.5 — 0.5Vl0) 1/2 . Computation gives the values M 4 = /^(O) = 12 and A/ 4 * = /^(x*) = -7.419. 
Since 2m = 10 and b — a - I, we obtain C = -J/l 800 000 = —0.000 000 56. Therefore, from (9), 


-0.000 007 ^ e s = 0-000 005. 


Hence J must lie between 0.746 825 - 0.000 007 = 0.746 818 and 0.746 825 + 0.000 005 = 0.746 830, 
so that at least four digits of our approximate value are exact Actually, the value 0.746 825 is exact to 5D because 
J = 0.746 824 (exact to 6D). 

Thus our result is much better than that in Example 1 obtained by the trapezoidal rule, whereas the number 
of operations is nearly the same in both cases. M 


Table 19.5 Computations in Example 3 


j 


x f 


e~ x ? 


0 

0 

0 

1.000 000 



1 

0.1 

0.01 


0.990050 


2 

0.2 

0.04 



0.960 789 

3 

0.3 

0.09 


0.913 931 


4 

0.4 

0.16 



0.852 144 

5 

0.5 

0.25 


0.778 801 


6 

0.6 

0.36 



0.697 676 

7 

0.7 

0.49 


0.612 626 


8 

0.8 

0.64 



0.527 292 

9 

0.9 

0.81 


0.444 858 


10 

1.0 

1.00 

0.367 879 



Sums 



1.367 879 

3.740266 

3.037 901 


Instead of picking an n = 2m and then estimating the error by (9), as in Example 3, it is 
better to require an accuracy (e.g., 6D) and then determine n = 2m from (9). 

Determination of n — 2m in Simpson’s Rule from the Required Accuracy 

What n should we choose in Example 3 to get 6D-accuracy? 

Solution . Using = 12 (which is bigger in absolute value than A^*), we get from (9), with b — a = 1 
and the required accuracy, 


\cm 4 \ = 


12 

1 80 (2m) 4 



thus 


m — 


2 * 10 6 -12 
1 80 • 2 4 



= 9.55. 


Hence we should choose n — 2/m = 20. Do the computation, which parallels that in Example 3. 

Note that the error bounds in (4) or (9) may sometimes be loose, so that in such a case a smaller n - 2m 
may already suffice. m 
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EXAMPLE 5 


EXAMPLE 6 


Error Estimation for Simpson’s Ruie by Halving A. The idea is the same as in (5) 
and gives 


( 10 ) 


6/1/2 ^ 15 ^ hl2 ^)' 


J h is obtained by using A and J hf2 by using \h , and e hf2 is the error of J hf2 - 
Derivation, In (5) we had § as the reciprocal of 3 = 4 — 1 and ^ = (|) 2 resulted from 
A 2 in (3) by replacing A with \h. In (10) we have as the reciprocal of 15 = 16 — 1 and 
jg = (I) 4 results from /? 4 in (8) by replacing h with \h. 

Error Estimation for Simpson’s Rule by Halving 

Integrate /(. x) = Jtt.v 4 cos Jtt.v from 0 to 2 with h — 1 and apply (10). 

Solution . The exact 5D- value of the integral is J = 1.25953. Simpson’s rule gives 

Jjl = Um + 4/(1) 4 - /( 2)] = 1(0 4 - 4 • 0.555360 4 - 0) = 0.740480. 



m + 4/ 



4 - 2/(1) 4 - 4/ 



= -[0 4 - 4- 0.045351 4 - 2 • 0.555361 + 4 • 1.521579 4 - 0] = 1.22974. 
6 


Hence (10) gives e h!2 = ^(1.22974 - 0.74048) = 0.032617 and thus J == J }U2 + *h/2 m 1-26236, with an 
error —0.00283, which is less in absolute value than of the error 0.02979 of J iif2 . Hence the use of (10) was 
well worthwhile. ■ 


Adaptive Integration 

The idea is to adapt step A to the variability of /(x). That is, where f varies but little, we 
can proceed in large steps without causing a substantial error in the integral, but where f 
varies rapidly, we have to take small steps in order to stay everywhere close enough to 
the curve of /. 

Changing h is done systematically, usually by halving A, and automatically (not “by 
hand”) depending on the size of the (estimated) error over a subinterval. The subinterval 
is halved if the corresponding error is still too large, that is, larger than a given tolerance 
TOL (maximum admissible absolute error), or is not halved if the error is less than or 
equal to TOL. 

Adapting is one of the techniques typical of modem software. In connection with 
integration it can be applied to various methods. We explain it here for Simpson’s rule. 
In Table 19.6 a star means that for that subinterval, TOL has been reached. 

Adaptive Integration with Simpson’s Rule 

Integrate /(.v) = cos ^tt.x from a* = 0 to 2 by adaptive integration and with Simpson's rule and 

TOL[0. 2] = 0.0002. 

Solution . Tabic 1 9.6 shows the calculations. Figure 44 1 shows the integrand /(.v) and the adapted intervals 
used. The first two intervals ([0. 0.5], [0.5, 1.0]) have length 0.5, hence /? = 0.25 [because we use 2m = 2 
subintervals in Simpson's rule (7**)]. The next two intervals ([1.00, 1.25], [1.25, 1.50]) have length 0.25 (hence 
h = 0.J25) and the last four intervals have length 0.125. Sample computations. For 0.740480 see Example 5. 
Formula (10) gives (0.123716 - 0.122794)715 = 0.000061. Note that 0.123716 refers to [0. 0.5] and [0.5, I], 
so that we must subtract the value corresponding to [0, I] in the line before. Etc. TOL[0, 2] = 0.0002 gives 
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0.0001 for subintervals of length 1 , 0.00005 for length 0.5, etc. The value of the integral obtained is foe sum of 
the values marked by an asterisk (for which the error estimate has become less than TOL). This gives 

J « 0.123716 + 0.528895 + 0.388263 + 0,218483 = 1.25936. 

The exact 5D-value is / = 1.25953. Hence the error is 0.00017. This is about 1/200 of the absolute value of 
that in Example 5. Our more extensive computation has produced a much better result. I 


Table 19.6 Computations in Example 6 


Interval 

Integral 

Error (10) 

TOL 

Comment 

[0,2] 


0.740480 


0.0002 


[0, 1] 
[1,2] 

Sum = 

0.122794 

1.10695 

1.22974 

0.032617 

0.0002 

Divide further 

[0.0, 0.5] 
[0.5, 1.0] 

Sum = 

0.004782 

0.118934 

0.123716* 

0.000061 

0.0001 

TOL reached 

[1.0, 1.51 
[1.5, 2.0] 

Sum = 

0.528176 

0.605821 

1.13300 

0.001803 

0.0001 

Divide further 

[1.00, 1.25] 
[1.25, 1.50] 

Sum = 

0.200544 

0.328351 

0.528895* 

0.000048 

0.00005 

TOL reached 

[1.50, 1.75] 
[1.75, 2.00] 

Sum = 

0.388235 

0.218457 

0.606692 

0.000058 

0.00005 

Divide further 

[1.500, 1.625] 
(1.625, 1.750] 

Sum = 

0.196244 

0.192019 

0.388263* 

0.000002 

0.000025 

TOL reached 

[1.750, 1.875] 
[1.875, 2.000] 

Sum = 

0.153405 

0.065078 

0.218483* 

0.000002 

0.000025 

TOL reached 


fix) 

1.5 






1.0 

- 





0.5 
n t 


i , i , . . 

1 

1 



0 c 

) 075 

1 1.5 

2 x 



Fig. 441. Adaptive integration in Example 6 









826 


CHAP. 19 Numerics in General 


EXAMPLE 7 


Gauss Integration Formulas 
Maximum Degree of Precision 

Our integration formulas discussed so far use function values at predetermined 
(equidistant) ^-values (nodes) and give exact results for polynomials not exceeding a 
certain degree [called the degree of precision; see after (9)]. But we can get much more 
accurate integration formulas as follows. We set 


n 

m di-'Z Ajfj [fj = f( tj )] 

j=i 

with fixed /?, and t = ± 1 obtained from x : = a, b by setting x = \[a{t — 1) + b(t + 1)]. 
Then we determine the n coefficients A x , • • • , A n and n nodes / 1? • • • , t n so that (11) 
gives exact results for polynomials of degree k as high as possible. Since « + n = 2 n is 
the number of coefficients of a polynomial of degree 2 n — 1, it follows that k^2n — 1. 

Gauss has shown that exactness for polynomials of degree not exceeding 2n — 1 (instead 
of /t — 1 for predetermined nodes) can be attained, and he has given the location of the 
tj (= the jth zero of the Legendre polynomial P n in Sec. 5.3) and the coefficients Aj which 
depend on n but not on /(/), and are obtained by using Lagrange’s interpolation polynomial, 
as shown in Ref. [E5] listed in App. 1. With these tj and Aj, formula (1 1) is called a Gauss 
integration formula or Gauss quadrature formula. Its degree of precision is 2/7 — 1, as 
just explained. Table 19.7 gives the values needed for n . = 2, • • • , 5. (For larger n , see 
pp. 916—919 of Ref. [GRl] in App. 1.) 


Table 19.7 Gauss Integration: Nodes ty and Coefficients Ay 


11 

Nodes tj 

Coefficients Aj 

Degree of Precision 

2 

-0.57735 02692 
0.57735 02692 

1 

1 

3 


-0.77459 66692 

0.55555 55556 


3 

0 

0.88888 88889 

5 


0.77459 66692 

0.55555 55556 



-0.86113 63116 

0.34785 48451 


4 

-0.33998 10436 
0.33998 10436 

0.65214 51549 
0.65214 51549 

7 


0.86113 63116 

0.34785 48451 



-0.90617 98459 

0.23692 68851 



-0.53846 93101 

0.47862 86705 


5 

0 

0.56888 88889 

9 


0.53846 93101 

0.47862 86705 



0.90617 98459 

0.23692 68851 




Gauss Integration Formula with n = 3 

Evaluate the integral in Example 3 by the Gauss integration formula (II) with n = 3. 

Solution . We have to convert our integral from 0 to I into an integral from -1 to 1. We sel.v = £(r + 1). 
Then dx = \ dt, and (I !) with // = 3 and the above values of the nodes and the coefficients yields 
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(exact to 6D: 0.746 825). which is almost as accurate as the Simpson result obtained in Example 3 with a much 
larger number of arithmetic operations. With 3 function values (as in this example) and Simpson’s rule we would 
get ^(1 + 4e -0 ' 25 + e -1 ) = 0.747 180, with an error over 30 times that of the Gauss integration. H 

Gauss Integration Formula with n = 4 and 5 

Integrate f(x) = £tt.v 4 cos ^tt.v from x - 0 to 2 by Gauss. Compare with the adaptive integration in Example 6 
and comment. 

Solution, x — t + 1 gives f(r) = 57 ir(f + 1 ) 4 cos + 1 )). as needed in (1 1 ). For n = 4 we calculate (6S) 

J ~ A 1 f l + • • • + A4/4 = A1C/1 + / 4 ) + A 2 {f 2 + f z ) 

= 0.347855(0.000290309 + 1.02570) + 0.652145(0.129464 + 1.25459) = 1.25950. 

The error is 0.00003 because ./ = 1.25953 (6S). Calculating with 10S and n = 4 gives the same result: so the 
error is due to the formula, not rounding. For n = 5 and 10S we get J ^ 1.25952 6185. too large by the amount 
0.00000 0250 because J = 1.25952 5935 (10S). The accuracy is impressive, particularly if we compare the 
amount of work with that in Example 6. M 


Gauss integration is of considerable practical importance. Whenever the integrand / is 
given by a formula (not just by a table of numbers) or when experimental measurements 
can be set at times tj (or whatever t represents) shown in Table 19.7 or in Ref. [GR1], 
then the great accuracy of Gauss integration outweighs the disadvantage of the complicated 
tj and Aj (which may have to be stored). Also, Gauss coefficients Aj are positive for all 
/z, in contrast with some of the Newton-Cotes coefficients for larger n . 

Of course, there are frequent applications with equally spaced nodes, so that Gauss 
integration does not apply (or has no great advantage if one first has to get the tj in (1 1) 
by interpolation). 

Since the endpoints — 1 and 1 of the interval of integration in (1 1) are not zeros of P n , 
they do not occur among ’ * * * and the Gauss formula (11) is called, therefore, an 
open formula, in contrast with a closed formula, in which the endpoints of the interval 
of integration are t 0 and t n . [For example, (2) and (7) are closed formulas.] 


Numeric Differentiation 

Numeric differentiation is the computation of values of the derivative of a function / 
from given values of /. Numeric differentiation should be avoided whenever possible, 
because, whereas integration is a smoothing process and is not affected much by small 
inaccuracies in function values, differentiation tends to make matters rough and generally 
gives values of f much less accurate than those of / — remember that the derivative is 
the limit of the difference quotient, and in the latter you usually have a small difference 
of large quantities that you then divide by a small quantity. However, the formulas to be 
obtained will be basic in the numeric solution of differential equations. 

We use the notations // = f'(Xj\ f"= f”(Xj\ etc., and may obtain rough approximation 
formulas for derivatives by remembering that 


This suggests 


f(x) = lim 
h — *0 


/(* + h) - f(x) 
h 
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/iri ,/ Sfm _ fi “ fo 

(12) f m - — - — — • 

Similarly, for the second derivative we obtain 

(13) f ~ & - h^k±h etc 

(13) fl ~ h 2 ~ h 2 ’ 6tC - 

More accurate approximations are obtained by differentiating suitable Lagrange 
polynomials. Differentiating (6) and remembering that the denominators in (6) are 2 h 2 , 
— /r 2 , 2 h 2 y we have 

. . 2x — x 1 — Xo 2x — x n — xo 2x — x n — 

fw ~ P 2 M = — ^ — - /o ^ — - h + — ~2 — 1 u 

Evaluating this at jt 0 , x v x 2 , we obtain the “three-point formulas” 

(a) s '° ~ in ( _3/o + 4/1 _ /2> ’ 

(14) (b) fi ~ L <-/ 0 + / 2 ), 

(c) fi - ^ (/o - 4/ x + 3/ 2 ). 

Applying the same idea to the Lagrange polynomial p 4 (x), we obtain similar formulas, 
in particular. 


(/o ” 8/1 + 8/3 " /4) - 


Some examples and further formulas are included in the problem set as well as in 
Ref. [E5] listed in App. 1. 


ROBLEM SET 19.5 


1. (Rectangular rule) Evaluate the integral in Example 
1 by the rectangular rule ( 1 ) with a subinterval of length 
0 . 1 . 

2. Derive a formula for lower and upper bounds for die 
rectangular rule and apply it to Prob. 1 . 

[3^8] TRAPEZOIDAL AND SIMPSON’S RULES 

Evaluate the integrals numerically as indicated and 
determine the error by using an integration formula known 
from calculus. 



3. F( 2) by (2), n = 10 

4. F{2) by (7), n = 10 

5. G(l) by (2), n = 10 

6. G(l) by (7), n = 10 

7. H( 4) by (2), n = 10 

8. H( 4) by (7), n = 10 

9-121 HALVING 

Estimate the error by halving. 

9. In Prob. 5 

10. Tn Prob. 6 

11. In Prob. 7 

12. In Prob. 8 
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1 13—19 [ NONELEMENTARY INTEGRALS 

The following integrals cannot be evaluated by the usual 
methods of calculus. Evaluate them as indicated. 


f x sinj 
Si(.v) = J — 

•'ft A 


dx*. 


S(a) - f sin (a* 2 ) */a*, C(a) = f cos (a* 2 ) dx* 

An An 


Si(a) is the sine integral. S(a) and C(.v) are the Fresnel 
integrals. (See App. 3.1.) 

13. Si(l) by (2). n = 5, ;i = 10 

14. Using the values in Prob. 13, obtain a better value for 
Si(l). Hint. Use (5). 

15. Si(l ) by (7), 2m = 2, 2 m = 4 

16. Obtain a better value in Prob. 15. Hint. Use (10). 

17. Sid) by (7). 2m = 10 

18. S( 1.25) by (7). 2m = 10 

19. C( 1 .25) by (7). 2m = 10 


20. (Stability) Prove that the trapezoidal rule is stable with 
respect to rounding. 


21-24 


GAUSS INTEGRATION 

Integrate by (11) with n = 5: 

21. 1/a* from 1 to 3 

22. cos a* from 0 to 57 r 

23. e~ x * from 0 to 1 

24. sin (.v 2 ) from 0 to 1 .25 


25. (Given TOL) Find the smallest n in computing the 
integral of 1/a* from 1 to 2 for which 5D-accuracy is 
guaranteed (a) by (4) in the use of (2). (b) by (9) in the 
use of (7). Compare and comment. 

26. TEAM PROJECT. Romberg Integration (W. 
Romberg, Norske Videnskab. Trondheim , F0H1. 28. 
Nr. 7, 1955). This method uses the trapezoidal rule and 
gains precision stepwise by halving h and adding an 
error estimate. Do this for the integral of /(a) = e~ x 
from a = 0 to a* = 2 with TOL = 10~ 3 , as follows. 

Step L Apply the trapezoidal rule (2) with h = 2 
(hence n = 1 ) to get an approximation J n . Halve h and 
use (2) to get J 2 1 and an error estimate 


If |€ 32 | ^ TOL, stop. The result is ^33 — ^32 “b € 32- 
(Why does 2 4 = 16 come in?) Show that we obtain 
€32 = —0.000266. so that we can stop. Arrange your 
J - and e-values in a kind of “difference table.” 



If |e 32 | were greater than TOL, you would have to 
go on and calculate in the next step J 41 from (2) with 
h = 3; then 


1 


$ 

+ 

II 

4 

with 

641 ~~ 3 ^ 41 

J43 — J42 “h € 42 

with 

e 42 = (^42 ~ 7 32 ) 

J44 = J43 + 643 

with 

1 

6 43 “ (^43 ~ ^33) 

where 63 = 2 6 - 1. (How does this come in?) 

Apply the Romberg method to the integral of 
/(a) = 4 7TA* 4 cos \ttx from a = 0 to 2 with TOL = 10“ 4 . 

DIFFERENTIATION 

27. Consider f(x) = a 4 

for .r 0 

— 0, A*i — 0.2, a 2 — 0.4, 


a 3 = 0.6. a 4 = 0.8. Calculate jf 2 from (14a), (14b), 
(14c), (15). Determine the errors. Compare and 
comment. 

28. A “four-point formula” for the derivative is 

/s - L (-2/, - 3/2 + 6/3 - u ). 

Apply it to /(a) = a 4 with x ly • • • , a 4 as in Prob. 27, 
determine the error, and compare it with that in the case 
of (15). 

29. The derivative f'(x) can also be approximated in terms 
of first-order and higher order differences (see 
Sec. 19.3): 


e 2i ” 2 2 - 1 ^ 21 

If |e 21 | = TOL, stop. The result is J22 = ^21 + ^21- 
Step 2. Show that e 21 = -0.066596, hence 
l c 2il ^ TOL and go on. Use (2) with h/4 to get / 31 and 
add to it the error estimate e 3 j = 5(7 31 — J 2 1) to get 
the better 7 32 = J 31 + e 31 . Calculate 
I /f r I 

€ 32 “ 94 _ j (^32 ^22) — (7 32 — 7 22 ). 


/'( A-o) = J (a/o - J A 2 /o 

+ J A3/ o - \ A “/o + ) • 

Compute /’(0.4) in Prob. 27 from this formula, using 
differences up to and including first order, second 
order, third order, fourth order. 

30. Derive the formula in Prob. 29 from (14) in Sec. 19.3. 
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1. What is a numeric method? How has the computer 
influenced numeric methods? 

2. What is floating-point representation of numbers? 
Overflow and underflow? 

3. How do error and relative error behave under addition? 
Under multiplication? 

4. Why are roundoff errors important? State the rounding 
rules. 

5. What is an algorithm? Which of its properties are 
important in software implementation? 

6. Why is the selection of a good method at least as 
important on a large computer as it is on a small one? 

7. Explain methods for solving equations, in particular 
fixed-point iteration and its convergence. 

8. Can the Newton (-Raphson) method diverge? Is it fast? 
Same questions for the bisection method. 

9. What is the advantage of Newton’s interpolation 
formulas over Lagrange’s? 

10. What do you remember about errors in polynomial 
interpolation? 

11. What is spline interpolation? Its advantage over 
polynomial interpolation? 

12. List and compare numeric integration methods. When 
would you apply them? 

13. In what sense is Gauss integration optimal? Explain 
details. 

14. What does adaptive integration mean? Why is it useful? 

15. Why is numeric differentiation generally more delicate 
than numeric integration? 

16. Write —0.35287, 1274.799, -0.00614, 24.9482, 1/3, 
85/7 in floating-point form with 5S (5 significant digits, 
properly rounded). 

17. Compute (5.346 — 3.644)/(3.454 — 3.055) as given and 
then rounded stepwise to 3S, 2S, 1 S. (“Stepwise” means 
rounding the four rounded numbers, not the given ones.) 
Comment on your results. 

18. Compute 0.29731/(4.1232 — 4.0872) with the numbers 
as given and then rounded stepwise (that is, rounding 
the rounded numbers) to 4S, 3S, 2S. Comment. 

19. Solve a * 2 — 50* + 1 = 0 by (6) and by (7) in Sec. 19.1, 
using 5S in the computation. Compare and comment. 

20. Solve * 2 - 200* + 4 = 0 by (6) and by (7) in Sec. 
19.1, using 5S in the computation. Compare and comment. 

21. Let 4.81 and 12.752 be correctly rounded to the number 
of digits shown. Determine the smallest interval in 
which the sum (using the true instead of the rounded 
values) must lie. 

22. Answer the question in Prob. 21 for the difference 
4.81 - 12.752. 


TIONS AND PROBLEMS 


23. What is the relative error of na in terms of that of al 

24. Show that the relative error of a 2 is about twice that of 
a. 

25. Compute the solution of * 5 = * + 0.2 near * = 0 by 
transforming the equation into the form * = g(x) and 
starting from * 0 - 0. (Use 6S.) 

26. Solve cos * = * by iteration (6S, * 0 = 1 )* writing it as 
x = (0.74* 4- cos*)/1.74, obtaining * 4 = 0.739 085 
(exact to 6SI). Why does this converge so rapidly? 

27. Solve * 4 — * 3 — 2* — 34 = 0 by Newton’s method 
with *o = 3 and 6S accuracy. 

28. Solve cos * — * = 0 by the method of false position. 

29. Compute /( 1.28) from 

/(1.0) = 3.00000 
/( 1.2) = 2.98007 
f(\A) = 2.92106 
/(L6) = 2.82534 
/(l. 8) = 2.69671 
/(2.0) = 2.54030 

by linear interpolation. By quadratic interpolation, using 
/(1-2). /(1.4), /(1.6). 

30. Find the cubic spline for the data 

/(- 0 = 3 
/(l) = 1 
/(3) = 23 
/(5) = 45 
*0 = *3 = 3. 

31. Compute the integral of* 3 from 0 to 1 by the trapezoidal 
rule with n = 5. What error bounds are obtained from 
(4) in Sec. 1 9.5? What is the actual error of the result? 
Why is this result larger than the exact value? 

32. Compute the integral of cos(* 2 ) from 0 to 1 by 
Simpson’s rule with 2m — 2 and 2m = 4 and estimate 
the eiTor by (10) in Sec. 19.5. (This is the Fresnel 
integral (38) in App. 3.1 with * =1.) 

33. Compute the integral of cos* from 0 to \ir by the 
three-eights rule 

f b 3 

I fix) dx = - h(f 0 + 3 f x + 3 / 2 + f z ) 

J a * 

- ^ (* - a)h 4 f (M (?) 

and give error bounds; here a £ / S b and 
xj = a + (b- a)j/3,j = 0, • • • , 3. 
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pnB OF CHAPTER 19 

Numerics in General 


In this chapter we discussed concepts that are relevant throughout numeric work as 
a whole and methods of a general nature, as opposed to methods for linear algebra 
(Chap. 20) or differential equations (Chap. 21). 

In scientific computations we use the floating-point representation of numbers 
(Sec. 19.1); fixed-point representation is less suitable in most cases. 

Numeric methods give approximate values a of quantities. The error e of a is 

(1) e = a-a (Sec. 19.1) 


where a is the exact value. The relative error of a is e/a. Errors arise from rounding, 
inaccuracy of measured values, truncation (that is, replacement of integrals by sums, 
series by partial sums), and so on. 

An algorithm is called numerically stable if small changes in the initial data give 
only correspondingly small changes in the final results. Unstable algorithms are 
generally useless because errors may become so large that results will be very 
inaccurate. Numeric instability of algorithms must not be confused with 
mathematical instability of problems (“ill-conditioned problems,” Sec. 19.2). 

Fixed-point iteration is a method for solving equations f(x ) = 0 in which the 
equation is first transformed algebraically to x = g(x), an initial guess a 0 for the 
solution is made, and then approximations x x , .v 2 , • * * , are successively computed 
by iteration from (see Sec. 19.2) 


(2) 


*n+l = 8(*n) 


(n = 0, 1, • • •)• 


Newton’s method for solving equations /( x) = 0 is an iteration 


( 3 ) 


fa) 

/'(An) 


(Sec. 19.2). 


Here A n+1 is the A-intercept of the tangent of the curve y = f(x) at the point x n . 
This method is of second order (Theorem 2, Sec. 19.2). If we replace f in (3) by 
a difference quotient (geometrically: we replace the tangent by a secant), we obtain 
the secant method; see ( 10) in Sec. 19.2. For the bisection method (which converges 
slowly) and the method of false position , see Problem Set 19.2. 

Polynomial Interpolation means the determination of a polynomial p n (x) such 
that p n (xj) = fj, where j = 0, • • ■ , n and (a* 0 , / 0 ), * * * , (x n , f n ) are measured or 
observed values, values of a function, etc .p n (x) is called an interpolation polynomial 
For given data. p n (x) of degree n (or less) is unique. However, it can be written in 
different forms, notably in Lagrange’s form (4), Sec. 19.3, or in Newton’s divided 
difference form (10), Sec. 19.3, which requires fewer operations. For regularly 
spaced a 0 , x x = x 0 + K • * • , x n = x 0 -f- nli the latter becomes Newton’s forward 
difference formula (formula (14) in Sec. 19.3) 
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/*(/* — 1) •••(/■ — n + 1) 

(4) f(x) - Pn ( x) = / 0 + rA/o + • • * + ; & fo 

n! 

where r = (x — x 0 )/h and the forward differences are A fj = fj +l — fj and 

A % = A fc-1 / J+1 - A k ~% (* = 2, 3, • • •)• 

A similar formula is Newton's backward difference interpolation formula (formula 
(18) in Sec. 19.3). 

Interpolation polynomials may become numerically unstable as n increases, and 
instead of interpolating and approximating by a single high-degree polynomial it is 
preferable to use a cubic spline #(a), that is, a twice continuously differentiable 
interpolation function [thus, g(Xj) = fj], which in each subinterval Xj ^ a* ^ Aj +1 
consists of a cubic polynomial qj(x)\ see Sec. 19.4. 

Simpson’s rule of numeric integration is [see (7), Sec. 19.5] 
r b . h 

(5) J fix) dx = — (/o + 4/i 4- 2/ 2 + 4/3 + • • • + 2/ 2m _ 2 + 4/2,,^! + f 2m ) 

J a ^ 


with equally spaced nodes xj = x 0 + jh, j = 1, • • • , 2m, h = (b — a)/(2m), and 
fj = f(Xj). It is simple but accurate enough for many applications. Its degree of 
precision is DP = 3 because the error (8), Sec. 19.5, involves /? 4 . A more practical 
error estimate is (10), Sec. 19.5, 

6)1/2 ~ T5 ^ h!2 ~~ 

obtained by first computing with step h, then with step /z/2, and then taking 1/15 of 
the difference of the results. 

Simpson’s rule is the most important of the Newton-Cotes formulas, which are 
obtained by integrating Lagrange interpolation polynomials, linear ones for 
the trapezoidal rule (2), Sec. 19.5, quadratic for Simpson’s rule, cubic for the 
three-eights rule (see the Chap. 19 Review Problems), etc. 

Adaptive integration (Sec. 19.5, Example 6) is integration that adjusts 
(“adapts”) the step (automatically) to the variability of /(a*). 

Romberg integration (Team Project 26, Problem Set 19.5) starts from the 
trapezoidal rule (2), Sec. 19.5, with /t, h! 2, W4, etc. and improves results by 
systematically adding error estimates. 

Gauss integration (11), Sec. 19.5, is important because of its great accuracy 
(DP = 2n — 1, compared to Newton-Cotes’s DP = /; — 1 or n). This is achieved 
by an optimal choice of the nodes, which are not equally spaced; see Table 19.7, 
Sec. 19.5. 

Numeric differentiation is discussed at the end of Sec. 1 9.5. (Its main application 
(to differential equations) follows in Chap. 21.) 




CHAPTER 2 0 
Numeric Linear Algebra 


In this chapter we explain some of the most important numeric methods for solving linear 
systems of equations (Secs. 20.1-20.4), for fitting straight lines or parabolas (Sec. 20.5), 
and for matrix eigenvalue problems (Secs. 20.6-20.9). These methods are of considerable 
practical importance because many problems in engineering, statistics, and elsewhere lead 
to mathematical models whose solution requires methods of numeric linear algebra. 

COMMENT. This chapter is independent of Chap . 19 and can be studied immediately 
after Chap . 7 or 8. 

Prerequisite: Secs. 7.1. 7.2, 8.1. 

Sections that may be omitted in a shorter course: 20.4, 20.5, 20.9 
References and Answers to Problems: App. 1 Part E, App. 2 

20.1 Linear Systems: Gauss Elimination 

A linear system of n equations in n unknowns , x n is a set of equations 

E x , • • • , of the form 


Ei: 

tfllA-i + • * 

• + «in*n = b 1 

E 2 : 

a 2\ x l * * 

+ 02n x n ~ &2 

E„: 

fl» 1*1 + * * 

a nn x n 


where the coefficients a jk and the bj are given numbers. The system is called homogeneous 
if all the bj are zero; otherwise it is called nonhomogeneous. Using matrix multiplication 
(Sec. 7.2), we can write ( 1 ) as a single vector equation 

(2) Ax = b 

where the coefficient matrix A = [a jk ] is the n X n matrix 


”«11 

a 12 


a ln~ 


x i 


“*r 

^21 

a 22 


a 2n 

, and x = 

. 

and and b = 


_^nl 

a n2 


^nn_ 




-K_ 
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are column vectors. The following matrix A is called the augmented matrix of the 
system (1): 


«11 • • • <hn b t 


A = [A b] 


a 21 


«2« l> 2 


L^nl 


a nn bnj 


A solution of (1) is a set of numbers x\, • • • , that satisfy all the n equations, and a 
solution vector of (1) is a vector x whose components constitute a solution of (1). 

The method of solving such a system by determinants (Cramer’s rule in Sec. 7.7) is 
not practical, even with efficient methods for evaluating the determinants. 

A practical method for the solution of a linear system is the so-called Gauss elimination , 
which we shall now discuss (proceeding independently of Sec . 7.3). 


Gauss Elimination 

This standard method for solving linear systems (1) is a systematic process of elimination 
that reduces (1) to “triangular form” because the system can then be easily solved by 
“back substitution.” For instance, a triangular system is 

3jvT| + 5.v 2 + 2*3 = 8 

8a*2 + 2a- 3 — “7 
6a* 3 = 3 

and back substitution gives x 3 = 3/6 = 1/2 from the third equation, then 

*2 = s(“ 7 “ 2*3) = 

from the second equation, and finally from the first equation 

-Vi = 1(8 - 5 at 2 - 2*3) = 4. 

How do we reduce a given system (1) to triangular form? In the first step we eliminate 
jc x from equation E 2 to E n in (1). We do this by adding (or subtracting) suitable multiples 
of E 1 from equations Eg, ■ • * , E n and taking the resulting equations, call them E%, • • , 
E£ as the new equations. The first equation, E 1? is called the pivot equation in this step, 
and a u is called the pivot This equation is left unaltered. In the second step we take the 
new second equation Eg (which no longer contains a' x ) as the pivot equation and use it to 
eliminate x 2 from E 3 to E^. And so on. After n — 1 steps this gives a triangular system 
that can be solved by back substitution as just shown. In this way we obtain precisely all 
solutions of the given system (as proved in Sec. 7.3). 

The pivot a kk (in step k) must be different from zero and should be large in absolute 
value, to avoid roundoff magnification by the multiplication in the elimination. For 
this we choose as our pivot equation one that has the absolutely largest a jk in column 
k on or below the main diagonal (actually, the uppermost if there are several such 
equations). This popular method is called partial pivoting. It is used in CASs (e.g., 
in Maple). 
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EXAMPLE 1 


Partial pivoting distinguishes it from total pivoting, which involves both row and 
column interchanges but is hardly used in practice. 

Let us illustrate this method with a simple example. 

Gauss Elimination. Partial Pivoting 

Solve the system 

Eli 8*2 + 2*3 = —7 

E 2 : 3 *x + 5*2 + 2*3 = 8 

E3: 6*1 ■+■ 2*2 + 8*3 = 26 . 

Solution . We must pivot since Ej has no .Vj-term. In Column 1, equation E 3 has the largest coefficient. 
Hence we interchange E A and E 3 , 

6* t + 2x 2 + 8*3 = 26 
3 *! + 5*2 + 2.Y3 = 8 

8*2 + 2**3 = - 7 . 

Step 1. Elimination of * a 

It would suffice to show the augmented matrix and operate on it. We show both the equations and the augmented 
matrix. In the first step, the first equation is the pivot equation. Thus 


Pivot 6 

— 2*2 8*3 = 26 

“6 

2 

8 1 
1 

26 " 

Eliminate — 

— >[3*^+ 5*2 + 2*3 = 8 

3 

5 

2 | 

8 


8*2 + 2*3 = —7 

.0 

8 

1 

2 l 

- 7 . 


To eliminate X\ from the other equations (here, from the second equation), do: 

Subtract 3/6 = 1/2 times the pivot equation from the second equation. 


The result is 


6*! 4 * 2*2 + 8*3 = 26 

"6 

2 

8 

l 

l 

26 " 

to 

1 

L? 

« 

ll 

Jn 

0 

4 

—2 

1 

1 

-5 

8*2 + 2*3 = -7 1 

.0 

8 

2 

1 

l 

— 7 _ 


Step 2. Elimination of * 2 

The largest coefficient in Column 2 is 8. Hence we take the new third equation as the pivot equation, interchanging 
equations 2 and 3 , 



6* a + 2*2 

+ 8*3 = 26 

"6 

2 

8 

1 

1 

26 ’ 

Pivot 8 



>+ 2*3 = -7 

0 

8 

2 

1 

1 

-7 

Eliminate — 

-> 

4*2 

1 

V 

to 

II 

1 

L/l 

.0 

4 

-2 

l 

l 

- 5 . 


To eliminate * 2 from the third equation, do: 

Subtract 1/2 times the pivot equation from the third equation. 

The resulting triangular system is shown below. This is the end of the forward elimination. Now comes the back 
substitution. 

Back substitution . Determination of x 3. * 2 , xj 
The triangular system obtained in Step 2 is 


6*1 + 2*2 + 8*3 — 26 

'6 

2 

S 

1 

1 

26" 

8*2 + 2*3 = —7 

0 

8 

2 

1 

l 

-7 

- 3*3 = -| 

.0 

0 

-3 

l 

l 

_2 

2-1 



836 


CHAP. 20 Numeric Linear Algebra 


EXAMPLE 2 


EXAMPLE 3 


From this system, taking the last equation, then the second equation, and Finally the first equation, we compute 
the solution 

*3 = 2 

*2 = b( -7 - 2.V 3 ) = ~ I 
-«i = e(26 ~ 2a - 2 — 8 * 3 ) = 4. 

This agrees with the values given above, before the beginning of the example. ■ 

The general algorithm for the Gauss elimination is shown in Table 20.1. To help explain 
the algorithm, we have numbered some of its lines, bj is denoted by a jtn+lJ for uniformity. 
In lines 1 and 2 we look for a possible pivot. [For k = 1 we can always find one; otherwise 
a*x would not occur in (1).] In line 2 we do pivoting if necessary, picking an a jk of greatest 
absolute value (the one with the smallest j if there are several) and interchange the 
corresponding rows. If \a kk \ is greatest, we do no pivoting. mj k in line 3 suggests multiplier , 
since these are the factors by which we have to multiply the pivot equation E k in Step k 
before subtracting it from an equation E* below Ejf from which we want to eliminate x k . 
Here we have written E^ and E* to indicate that after Step 1 these are no longer the given 
equations in (1), but these underwent a change in each step, as indicated in line 4. 
Accordingly, a jk etc. in lines 1-4 refer to the most recent equations, and j = k in line 1 
indicates that we leave untouched all the equations that have served as pivot equations in 
previous steps. For p = k in line 4 we get 0 on the right, as it should be in the elimination, 

ci jk 

a jk ~~ m jk a kk ~~ Gjk ~ a kk ~ 0 * 

a kk 

In line 5, if the last equation in the triangular system is 0 = ft* ¥= 0, we have no 
solution. If it is 0 = Z?* = 0, we have no unique solution because we then have fewer 
equations than unknowns. 

Gauss Elimination in Table 20.1, Sample Computation 

In Example I we had fl n = 0, so that pivoting was necessary. The greatest coefficient in Column I was a 3i . 
Thus/ = 3 in line 2. and we interchanged E x and E3. Then in lines 3 and 4 we computed rn 2 1 = 3/6 = \ and 

il 22 = 5 — 5 • 2 = 4 , ^23 = 2 — | * 8 = — 2 , a 24 = 8 — 5*26= — 5 , 

and then m 31 - 0/6 = 0, so that the third equation &v 2 + 1* 3 = -7 did not change in Step I. In Step 2 
( k = 2) we had 8 as die greatest coefficient in Column 2, hence / — 3. We interchanged equations 2 and 3, 
computed m 32 = — 4/8 = — § in line 4. and die « 33 = -2 — \ • 2 = —3, « 34 = -5 — |(~7) = — This 
produced the triangular form used in the back substitution. ■ 

If a kk = 0 in Step k , we must pivot If \a kk \ is small, we should pivot because of roundoff 
eiTor magnification that may seriously affect accuracy or even produce nonsensical results. 

Difficulty with Small Pivots 

The solution of the system 

0.0004a'! -h 1.402*2 = 1.406 
0.4003.*! - i.502v 2 = 2.501 

is .*! = 10. .* 2 = I. We solve this system by the Gauss elimination, using four-digit floating-point arithmetic. 
(4D is for simplicity. Make an 8D-arithmetic example that shows the same.) 

(a) Picking the first of the given equations as the pivot equation, we have to multiply this equation by 
m ~ 0-4003/0.0004 = 1001 and subtract the result from the second equation, obtaining 
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Table 20.1 Gauss Elimination 


ALGORITHM GAUSS (A = [a jk ] = [A b]) 

This algorithm computes a unique solution x = [xj] of the system (1) or indicates that 
(1) has no unique solution. 

INPUT: Augmented n X (n + 1) matrix A = [cij k ], where aj >n + 1 = bj 

OUTPUT: Solution x = [x^] of (1) or message that the system (1) has no 
unique solution 


1 


2 

3 


4 


For k = 1, • • • , n — 1, do: 

If a jk = 0 for all j ^ k then OUTPUT “No unique solution 
exists.” Stop 


End 


[Procedure completed unsuccessfully; A is singular] 

Else exchange the contents of rows J and k of A with J the smallest 
j ^ k such that \a^ k \ is maximum in column k . 

For j = k + I, • • • , /z, do: 


Mjk- = 


a jk 

a kk 


End 


For p = k + 1 , • • • , n 4- 1 , do: 

cij p : — cijp ~ mj k a k p 

End 


5 If a nn = 0 then OUTPUT “No unique solution exists.” 

Stop 
Else 


6 x n = a* 1 ' 71 * 1 [Start back substitution] 

For i = n — 1, do: 

7 x i = ~r (‘Wi - 2 

«« \ / 

End 

OUTPUT x = [x,]. Stop 
End GAUSS 


—1405*2 = -1404. 

Hence ,v 2 = — 1404/(— 1405) = 0.9993, and from the first equation, instead of jtj = 10, we get 

I 0.005 

" OOOM ('■«- I -«2 ■».»») - — - 12.5. 

This failure occurs because |a n | is small compared with |a 12 |, so that a small roundoff error in x 2 leads to a 
large error in x v 
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(b) Picking the second of the given equations as the pivot equation, we have to multiply this equation by 
0.0004/0.4003 = 0.000 9993 and subtract the result from the first equation, obtaining 

t.404.v 2 = 1.404. 

Hence .v 2 = 1. and from the pivot equation .Yj = 10. This success occurs because |fl 2 il Is not very small 
compared to |« 2 2 ^ so that a small roundoff error in .v 2 would not lead to a large error in x v Indeed, for 
instance, if we had the value .v 2 = 1.002, we would still have from the pivot equation the good value 
a*i = (2.501 + 1. 505 )/0 .4003 = 10.01. ■ 

Error estimates for the Gauss elimination are discussed in Ref. [E5] listed in App. 1. 

Row scaling means the multiplication of each Row j by a suitable scaling factor sj. It is 
done in connection with partial pivoting to get more accurate solutions. Despite much 
research (see Refs. [E9], [E24] in App. 1) and the proposition of several principles, scaling 
is still not well understood. As a possibility, one can scale for pivot choice only (not in 
the calculation, to avoid additional roundoff) and take as first pivot the entry a :}1 for which 
kji|/W * s here Aj is an entry of largest absolute value in Row j. Similarly in the 

further steps of the Gauss elimination. 

For instance, for the system 

4.0000a'! + 14020a 2 = 14060 
0.4003a'! - 1.502*2 = 2.501 

we might pick 4 as pivot, but dividing the first equation by 10 4 gives the system in Example 
3, for which the second equation is a better pivot equation. 

Operation Count 

Quite generally, important factors in judging the quality of a numeric method are 
Amount of storage 

Amount of time (= number of operations) 

Effect of roundoff error. 

For the Gauss elimination, the operation count for a full matrix (a matrix with relatively 
many nonzero entries) is as follows. In Step k we eliminate x k from n — k equations. This 
needs n — k divisions in computing the m jk (line 3) and ( n - k)(n - k + 1 ) multiplications 
and as many subtractions (both in line 4). Since we do n — 1 steps, k goes from 1 to 
n — 1 and thus the total number of operations in this forward elimination is 

n— 1 n— 1 

/(/?) = 2 (n — k) + 2 2 0* “ k)(n — k + 1) (write n — k = s) 

1 k=l 

7i — 1 n— 1 

= 2 * + 2 2 s ( s + 1) = ~ 1 )h + |(« 2 — 1 )n ** §/7 3 

S=1 s = 1 

where 2n 3 /3 is obtained by dropping lower powers of n. We see that f(n) grows about 
proportional to n 3 . We say that f(n) is of order n 3 and write 


f(n) = 0(n 3 ) 
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where O suggests order. The general definition of O is as follows. We write 

f[n) = 0{h(n)) 

if the quotient \f(n)lh(n)\ remains bounded (does not trail off to infinity) as n — » oo. In 
our present case, h{n) = n z and, indeed, f{n)!n z — » 2/3 because the omitted terms divided 
by n z go to zero as n — » sc. 

In the back substitution of ** we make n — i multiplications and as many subtractions, 
as well as 1 division. Hence the number of operations in the back substitution is 


n n 

b(n) = 2 ^ (n — /) 4- n = 2 2 s + n = n(n 4- 1) 4- n = n 2 4- 2 n = 0(n 2 ). 

i = 1 s=l 

We see that it grows more slowly than the number of operations in the forward elimination 
of the Gauss algorithm, so that it is negligible for large systems because it is smaller by 
a factor n, approximately. For instance, if an operation takes 10~ 9 sec, then the times 
needed are: 


Algorithm 

n = 1000 

n = 10000 

Elimination 

0.7 sec 

11 min 

Back substitution 

0.001 sec 

0.1 sec 


PROBLEM SET 20.1 


For applications of linear systems see Secs. 7.1 and 8.2. 


1-3 


GEOMETRIC INTERPRETATION 


Solve graphically and explain geometrically. 
1. 4.\'j 4- jc 2 = —4.3 


3a*! — 5jc 2 “ —33.7 


2. 1.820*! - 1.183a* 2 = 0 
-12.74*! 4- 8.281* 2 = 0 

3. 7.2*! - 3.5*2 = 16.0 

-21.6*! + 10.5*2 = -48.5 

|«l—14 1 GAUSS ELIMINATION 

Solve the following linear systems by Gauss elimination, 
with partial pivoting if necessary (but without scaling). Show 
the intermediate steps. Check the result by substitution. If no 
solution or more than one solution exists, give a reason. 

4. 6a*! -I- * 2 — 

4a* i 2* 2 — 6 


5. 2*! — 8*2 — -4 
6 *! + 2*2 = 14 

6. 25.38*i “ 15.48*2 = 30.60 
— 7.05*i 4- 4.30*2 = “8.50 

7. 6a* 2 + 13*3 = 137.86 

6*i - 8*3 = —85.88 

13*, - 8*2 = 178.54 

8. 5*, 4- 3*2 4- *3=2 

- 4*2 4- 8*3 = — 3 
10*i “ 6*2 + 26*3 = 0 

9. 4*, + 10* 2 - 2*3 = -20 

— *, — 15*2 4“ 3**3 = 30 

25*2 “ 5*3 = —50 




840 


CHAP. 20 Numeric Linear Algebra 


10. A'j + 2*2 - 3*3 =-11 

1 0*i + *2 + A*3 = 8 

10*2 + 2 * 3 = 2 

11. 3.4*! - 6.12*2 - 2.72*3 = 0 

-*! + 1.80*2 + 0.80*3 = 0 
2.7*! - 4.86*2 - 2.16*3 = 0 

12. 3*2 + 5*3 = 1.20736 

3*! - 4*2 = -2.34066 

5*! + 6*3 = -0.329193 

13. 6.4*! + 3.2*2 = —1.6 

3.2*! - 1.6* 2 + 4.8*3 = 32.0 

4.8*2 ~ 9.6* 3 + 7.2* 4 = -78.0 
7.2*3 + 4.8*4 = 20.4 

14. 4.4*2 + 3.0*3 ~ 6.6*4 = -4.65 

0.4*! + 3.6*2 + 8.4* 4 = 4.62 

—2.0*! - 6.2*2 + 5.0*3 = —4.35 

*! - 7.6*3 + 3.0*4 = 5.97 

15. CAS EXPERIMENT. Gauss Elimination. Write a 
program for the Gauss elimination with pivoting. 
Apply it to Probs. 1 1-14. Experiment with systems 
whose coefficient determinant is small in absolute 
value. Also investigate the performance of your 
program for larger systems of your choice, including 
sparse systems. 

16. TEAM PROJECT. Linear Systems and Gauss 
Elimination, (a) Existence and uniqueness. Find a and 
b such that ax x + * 2 = b, x x + * 2 = 3 has (i) a unique 
solution, (ii) infinitely many solutions, (iii) no solutions. 


(b) Gauss elimination and nonexistence. Apply the 
Gauss elimination to the following two systems and 
compare the calculations step by step. Explain why the 
elimination fails if no solution exists. 

*i + * 2 + * 3 = 3 
4*! + 2*2 - *3 = 5 

9 *! + 5*2 - *3 = 1 3 

*1 + *2 + A*3 = 3 
4*! + 2*2 - X 3 = 5 
9*1 + 5*2 — *3 12. 

(c) Zero determinant. Why may a computer program 
give you the result that a homogeneous linear system 
has only the trivial solution although you know its 
coefficient determinant to be zero? 

(d) Pivoting. Solve System (A) (below) by the Gauss 
elimination first without pivoting. Show that for any 
fixed machine word length and sufficiently small e > 0 
the computer gives * 2 = l and then * t = 0. What is 
the exact solution? Its limit as e — » 0? Then solve the 
system by the Gauss elimination with pivoting. 
Compare and comment. 

(e) Pivoting. Solve System (B) by the Gauss 
elimination and three-digit rounding arithmetic, 
choosing (i) the first equation, (ii) the second equation 
as pivot equation. (Remember to round to 3S after each 
operation before doing the next, just as would be done 
on a computer!) Then use four-digit rounding arithmetic 
in those two calculations. Compare and comment. 

(A) €*i +*2=1 

*! +*2 = 2 

(B) 4.03*i + 2.16*2 = -4.61 

6.21*i + 3.35*2 = -7.19 


20u Linear Systems: LU-Factorization, 
Matrix Inversion 


We continue our discussion of numeric methods for solving linear systems of n equations 
in n unknowns x l9 • • • , * n , 

(1) Ax = b 

where A = [a jk ] is the n X n coefficient matrix and x T = • * • * n ] and b T = [frj • • • b n ]. 

We present three related methods that are modifications of the Gauss elimination, which 
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EXAMPLE 1 


require fewer arithmetic operations. They are named after Doolittle, Crout, and Cholesky 
and use the idea of the LU-factorization of A, which we explain first. 

An LU-factorization of a given square matrix A is of the form 

(2) A = LU 

where L is lower triangular and U is upper triangular . For example, 


~2 3“ 


"1 0“ 


“2 


= LU = 




_8 5. 


_4 l_ 


_0 


It can be proved that for any nonsingular matrix (see Sec. 7.8) the rows can be reordered 
so that the resulting matrix A has an LU-factorization (2) in which L turns out to be the 
matrix of the multipliers mj k of the Gauss elimination, with main diagonal l, • • • , 1, and 
U is the matrix of the triangular system at the end of the Gauss elimination. (See Ref. 
[E5], pp. 155-156, listed in App. 1.) 

The crucial idea now is that L and U in (2) can be computed directly, without solving 
simultaneous equations (thus, without using the Gauss elimination). As a count shows, this 
needs about n 3 / 3 operations, about half as many as the Gauss elimination, which needs about 
2 ;i 3 /3 (see Sec. 20.1). And once we have (2), we can use it for solving Ax = b in two steps, 
involving only about n 2 operations, simply by noting that Ax = LUx = b may be written 

(3) (a) Ly = b where (b) Ux = y 

and solving first (3a) for y and then (3b) for x. Here we can require that L have main diagonal 
1 , • • • , 1 as stated before; then this is called Doolittle’s method. Both systems (3a) and 
(3b) are triangular, so we can solve them as in the back substitution for the Gauss elimination. 

A similar method, Crout’s method, is obtained from (2) if U (instead of L) is required 
to have main diagonal 1 , • • • , 1 . In either case the factorization (2) is unique. 


Doolittle’s Method 

Solve the system in Example I of Sec. 20. 1 by Doolittle’s method. 
Solution . The decomposition (2) is obtained from 



~ *11 

*12 

*13 


'3 

5 

2 " 


"1 

0 

O' 


’*11 

*12 

*13 

A = M = 

*21 

*22 

*23 

= 

0 

8 

2 

= 

**21 

1 

0 


0 

*22 

*23 


_ *31 

*32 

* 33 . 


_ 6 

2 

S_ 


.**31 

**32 



_0 

0 

* 33 . 


by determining the nij k and Uj k . using matrix multiplication. By going through A row by row we get successively 


*11 - 3 - 1 * « n - lin 

a 12 = 5 = 1 • n 12 = u 12 

*13 - 2 - l • *13 - *13 

“21 = 0 = W 21 H 11 

*22 = 8 = **21*12 + *22 

*23 = 2 = **21*13 + *23 

**21 = 0 

*22 ~ 8 

**23 = ~ 

“31 = 6 = nigjH] j 

*32 = 2 = W31M12 + **32*22 

“33 = 8 = «l3i«i3 + HI 3 2 «23 + «33 

= »i 3 i • 3 

= 2 * 5 /W32 * 8 

— 2 • 2 — 1-2 + 1133 


"'31 “ 2 "'32 = "I II 33 = 6 
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Thus the factorization (2) is 


"3 5 2" 


0 

0 


‘3 5 2’ 

0 8 2 

= LE = 

0 1 0 


0 8 2 

.6 2 8. 


.2 -1 1_ 


\o 

0 

0 


We first solve Ly = b, determining \ r i = 8, then y 2 = -7, then v 3 from 2vj - y 2 + .V3 = 16 + 7 + y 3 = 26; 
thus (note the interchange in b because of the interchange in A!) 


“1 0 0" 


>r 


" 8' 


‘ 8“ 

0 l 0 


)'2 

= 

-7 

Solution y = 

-7 

2 -1 1_ 


-^3. 


. 26 . 


3_ 


Then we solve Ux = y, determining .v 3 = 3/6, then x 2 , then Xi> that is. 


‘3 5 2“ 


V 


" 8“ 


" 4“ 

0 8 2 


x 2 

= 

-7 

Solution x == 

-1 

0 0 6_ 




3_ 


J /2 - 


This agrees with the solution in Example 1 of Sec. 20.1. M 


Our formulas in Example 1 suggest that for general n the entries of the matrices 
L = [i m jk ] (with main diagonal 1, • • • , 1 and m jk suggesting “multiplier”) and U = [u jk ] 
in the Doolittle method are computed from 


U 1 k “ a lk 


A: = 1, • • • , n 

a jl 

mj 1 = 

w n 

j-i 

j = 2, • • • , n 

u jk ~ a jk ~ 

X m is u sk 
1 

k=j ,••• , n; 

1 

nu k = 


j = k + 1, • • 


Row Interchanges. Matrices, such as 


"0 

r 


'0 

r 



or 



J 

1. 


.1 

o_ 


have no LU-factorization (try!). This indicates that for obtaining an LU-factorization, row 
interchanges of A (and corresponding interchanges in b) may be necessary. 


Cholesky's Method 

For a symmetric, positive definite matrix A (thus A = A T , x T Ax > 0 for all x 0 ) we 
can in (2) even choose U = L T , thus u jk = m kj (but cannot impose conditions on the main 
diagonal entries). For example. 



"4 2 14' 


O 

O 

<N 


r- 

— i 

CM 

(5) A = 

2 17 -5 

= LL t = 

1 4 0 


0 4-3 


.14 -5 83_ 


_7 -3 5_ 


1 

O 

O 

1 
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EXAMPLE 2 


The popular method of solving Ax = b based on this factorization A = LL T is called 
Cholesky’s method. In terms of the entries of L = [l jk ] the formulas for the factorization 

are 

1 

II 

H 

ft 



, _ a Jl 

ljl ~ Ti 

j = 2, ,n 

(6) 

1 

hi ~ \ a H ~ hs 

j = 2 , • • • , n 


1 / j-1 \ 

ipj ~r~ I 2 ijs^ps ) 

hi \ 5-1 / 

P = j + 1, • • * , n; j S 2. 


If A is symmetric but not positive definite, this method could still be applied, but then 
leads to a complex matrix L, so that the method becomes impractical. 


Cholesky’s Method 

Solve by Cholesky’s method: 


4a x + lv 2 + 14 a'3 = 14 

2x x + 1 7a- 2 — 5a 3 = —101 
14a 1 ! - 5a 2 + 83a- 3 = 155, 

Solution . From (6) or from the form of the factorization 


" 4 

2 

14" 


"/it 

0 

0 " 


2 

17 

-5 

= 

hi 

I22 

0 


_i4 

-5 

83 _ 


J31 

hz 

hz. 



we compute, in the given order, 


hi 

hz 

0 


/ u = = 2 


on 


, - f2L - 1 - , 
'21 _ , ~ 2 " 1 

*11 1 


I _ 

hi ~ T“ 
hi 


I 99 — ^22 — hi Z ~~ Vl7 — 1 — 4 


hi 

hz 

hz_ 



7 


hz ~ . («32 ~ hihi) - T (~ 5 ~ 7 * 1 ) - -3 

• 22 ^ 


^33 — ^^33 ~ ^3i 2 — h^ ~ ^83 — 7 2 — (— 3) 2 — 5. 
This agrees with (5). We now have to solve Ly = b, that is. 


— 1 

to 

0 

0 

1 


>1" 


‘ 14“ 


‘ 7“ 

1 4 0 


.V2 

= 

-101 

Solution y = 

-27 

1 -3 5_ 


_J3_ 


155_ 


- 5 - 


As the second step, we have to solve Ux = L T x = y, that is. 


"2 1 T 


w 


" 7" 


" 3" 

0 4-3 




-27 

Solution x = 

-6 

_0 0 5_ 


*3. 


. 5 - 


l_ 
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THEOREM 


Stability of the Cholesky Factorization 

The Cholesky Ll7 -factorization is numerically stable (as defined in Sec. 19.1). 


PROOF We have ajj = lj 2 4 lj 2 2 4* ■ • • 4 lj 2 by squaring the third formula in (6) and solving it 
for ajj. Hence for all l jk (note that l jk = 0 for k > j) we obtain (the inequality being trivial) 

l jk 2 S l jx 2 + l j2 2 + • • • + = Oy. 

That is, /j fc 2 is bounded by an entry of A, which means stability against rounding. ■ 

Gauss-Jordan Elimination. Matrix Inversion 

Another variant of the Gauss elimination is the Gauss-Jordan elimination, introduced 
by W. Jordan in 1920, in which back substitution is avoided by additional computations 
that reduce the matrix to diagonal form, instead of the triangular form in the Gauss 
elimination. But this reduction from the Gauss triangular to the diagonal form requires 
more operations than back substitution does, so that the method is disadvantageous for 
solving systems Ax = b. But it may be used for matrix inversion, where the situation is 
as follows. 

The inverse of a nonsingular square matrix A may be determined in principle by solving 
the n systems 

(7) Ax = bj 0‘ = 1, • • • , n ) 

where is the jth column of the n X n unit matrix. 

However, it is preferable to produce A -1 by operating on the unit matrix I in the same 
way as the Gauss-Jordan algorithm, reducing A to I. A typical illustrative example of this 
method is given in Sec. 7.8. 


EHEQBBEEKfc 53E* 20. Z 


[U7| DOOLITTLE’S METHOD 

Show the factorization and solve by Doolittle’s method. 

1. 3x x 4 2a * 2 =15.2 

15*! 4 1 1*2 = 77.3 

2 . 2 *! + 9 a - 2 = 41 

3*! 5*2 ~ 31 

3. 4*j - 6*2 = -34 

8*j — 7*2 = — 53 

4. 2*! -f- *2 + 2*3 = 0 
~2*i + 2*2 4- * 3 = 0 

*1 “I - 2*2 2*3 = 36 


5. 6* x + 4*2 + 3*3 = 2.0 
4*! + 3*2 + 2*3 = 0.5 
3* x + 4*2 + 2*3 = —2.5 

6. *x — * 2 + 2.6*3 = — 9.88 

0.5*! — 3.0*2 + 3.3*3 = —16.54 

-1.5*! - 3.5*2 “ 10.4*3 = 21.02 

7. 3*| + 9*2 4 6*3 = 2.3 

18*! 4 48*2 + 39*3 = 13.6 

9 *! - 27*2 4 42*3 = 4.5 

8. TEAM PROJECT. Crout’s method factorizes 
A = LU, where L is lower triangular and U is upper 
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triangular with diagonal entries ujj = 1 J = 1, * * * , n, 

(a) Formulas. Obtain formulas for Crout’s method 
similar to (4). 

(b) Examples. Solve Probs. I and 7 by Crout’s method. 

(c) Factor the following matrix by the Doolittle, 
Crout, and Cholesky methods. 



(d) Give the formulas for factoring a tridiagonal 
mauix by Crout’s method. 

(e) When can you obtain Crout’s factorization from 
Doolittle’s by transposition? 

9— 13 1 CHOLESKY’S METHOD 

Show the factorization and solve. 

9. 9x x 4 6* 2 4 12.v 3 = 87 

6x1 4* 1 3* 2 4- 1 1a*3 = 118 
12a-! 4 I 1a- 2 4 26.Y3 = 154 

10. 0.04*! 4- 0.12*3 = 1.4 

0.64* 2 4 0.32*3 = 1.6 

0.12*i + 0.32*2 + 0.56*3 = 5.4 

11. 4*! 4- 6*2 4- 8*3 = 0 

6 *! - 1 - 34*2 + 52*3 = ”80 

8*! 4 52*2 -4 129*3 = -226 

12. *1 - * 2 4 3*3 4 2*4 = 30 

— *1 4- 5*2 ~ 5*3 - 2*4 = -70 

3*i — 5*2 + 19*3 + 3*4 = 188 

2*i — 2*2 4 3*3 4 21*4 = 2 


845 

13. 4*! 4 2* 2 + 4*3 - 20 

2*i 4 2*2 4 3*3 4 2*4 = 36 

4*! 4 3*2 4 6*3 4 3*4 = 60 

2*2 4 3*3 4 9.V4 = 1 22 

14. CAS PROJECT. Cholesky’s Method, (a) Write a 
program for solving linear systems by Cholesky’s 
method and apply it to Example 2 in the text, to Probs. 
9-1 1, and to systems of your choice. 

(b) Splines. Apply the factorization part of the 
program to the following matrices (as they occur in (9), 
Sec. 19.4 (with cj = 1), in connection with splines). 



15. (Definiteness) Let A and B be positive definite n X n 
matrices. Are - A, A T . A 4 B, A — B positive definite? 

1 16-19 1 INVERSE 

Find the inverse by the Gauss-Jordan method, showing the 
details. 

16. In Prob 4. 

17. In Prob. 5. 

18. In Prob. 6. 

19. In Prob. 7. 

20. (Rounding) For the following matrix A find det A. 
What happens if you round off the given entries to (a) 
5S, (b) 4S. (c) 3S, (d) 2S, (e) IS? What is the practical 
implication of your work? 

" 1/3 1/4 2 

A = -1/9 1 1/7 

. 4/63 -3/28 13/49. 


20.3 Linear Systems: Solution by Iteration 

The Gauss elimination and its variants in the last two sections belong to the direct methods 
for solving linear systems of equations; these are methods that give solutions after an 
amount of computation that can be specified in advance. In contrast, in an indirect or 
iterative method we start from an approximation to the true solution and, if successful, 
obtain better and better approximations from a computational cycle repeated as often as 
may be necessary for achieving a required accuracy, so that the amount of arithmetic 
depends upon the accuracy required and varies from case to case. 



846 


CHAP. 20 Numeric Linear Algebra 


EXAMPLE 1 


We apply iterative methods if the convergence is rapid (if matrices have large main 
diagonal entries, as we shall see), so that we save operations compared to a direct method. 
We also use iterative methods if a large system is sparse, that is, has very many zero 
coefficients, so that one would waste space in storing zeros, for instance, 9995 zeros per 
equation in a potential problem of 10 4 equations in I0 4 unknowns with typically only 5 
nonzero terms per equation (more on this in Sec. 21.4). 

Gauss-Seidel Iteration Method 1 

This is an iterative method of great practical importance, which we can simply explain in 
terms of an example. 

Gauss-Seidel Iteration 

We consider the linear system 


(I) 


Xl - 0.25.v 2 - 0.25*3 - 50 

-0.25*! + * 2 - 0.25*4 = 50 

—0.25*! + *3 — 0.25*4 = 25 

- 0.25*2 — 0.25*3 + *4 = 25. 


(Equations of this form arise in the numeric solution of PDEs and in spline interpolation.) We write the system 
in the form 


( 2 ) 


a*! — 0.25*2 0.25*3 50 

* 2 = 0.25*! + 0.25*4 + 50 


*3 = 0.25*i 


+ 0.25*4 + 25 


* 4 = 0.25*2 + 0-25*3 + 25. 


These equations are now used for iteration; that is, we start from a (possibly poor) approximation to the solution, 
say * ( ! 0) = 100 . * 2 0> = 100 . * 3 0) = 100 . * 4 0) = 100 . and compute from ( 2 ) a perhaps better approximation 


Use "old" values 

("New" values here not yet available) 

i 


4»= 


0.254 01 + 

0.254°’ 


+ 50.00 = 100.00 


0.254" 



0.254°’ 

+ 50.00 = 100.00 

■?*- 

0.254" 



0.254°’ 

+ 25.00 = 75.00 



0.254" + 

0.254” 


+ 25.00 = 68.75 


t 

Use "new" values 


These equations (3) are obtained from (2) by substituting on the right the most recent approximation for each 
unknown. Fn fact, corresponding values replace previous ones as soon as they have been computed, so that in 


PHILIPP LUDWIG VON SEIDEL (1821-1896). German mathematician. For Gauss see footnote 5 in 


Sec. 5.4. 
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the second and third equations we use 4“ (not x i° 5 ), and in the last equation of (3) we use 4 “ and 4 “ (not 
4° 5 and a-® 5 ). Using die same principle, we obtain in the next step 

4 2> = 0.25.4° + 0.254° + 50.00 = 93.750 

4 2> = 0.25.vf 1 + 0.254“ + 50.00 = 90.625 

4 2) = 0.25a:® 5 + 0.254“ + 25.00 = 65.625 

4 2> = 0.25.4 25 + 0.254 25 + 25.00 = 64.062 

Further steps give the values 


*1 

x 2 

*3 

A* 4 

89.062 

88.281 

63.281 

62.891 

87.891 

87.695 

62.695 

62.598 

87.598 

87.549 

62.549 

62.524 

87.524 

87.512 

62.512 

62.506 

87.506 

87.503 

62.503 

62.502 


Hence convergence 10 the exact solution Aj = x 2 — 87.5, x 3 — .v 4 = 62.5 (verify!) seems rather fast. H 

An algorithm for the Gauss-Seidel iteration is shown on the next page. To obtain the 
algorithm, let us derive the general formulas for this iteration. 

We assume that a# = 1 for j = 1, ■ • • , n. (Note that this can be achieved if we can 
rearrange the equations so that no diagonal coefficient is zero; then we may divide each 
equation by the corresponding diagonal coefficient.) We now write 

(4) A = I + L + U (ajj = 1) 

where I is the n X n unit matrix and L and U are respectively lower and upper triangular 
matrices with zero main diagonals. If we substitute (4) into Ax = b, we have 

Ax = (I + L + U)x = b. 

Taking Lx and Ux to the right, we obtain, since lx = x, 

(5) x = b - Lx — Ux. 

Remembering from (3) in Example 1 that below the main diagonal we took “new” 
approximations and above the main diagonal “old” ones, we obtain from (5) the desired 
iteration formulas 

”New M "Okr 

! i 

(6) x (TO+1) = b - Lx (m+1> - Ux (m) ( ajj = 1) 

where x (m> = [4" 5 ] is the mth approximation and x (m+15 = [4/’ l+15 ] is the (m + l)st 
approximation. In components this gives the formula in line 1 in Table 20.2. The matrix 
A must satisfy ajj ^ 0 for ally. In Table 20.2 our assumption cijj = 1 is no longer required, 
but is automatically taken care of by the factor 1 /a# in line 1. 
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Table 20.2 Gauss-Seidel Iteration 


ALGORITHM GAUSS-SEIDEL (A, b, x (0 \ e, AO 


This algorithm computes a solution x of the system Ax = b given an initial approximation 
x (0) . where A = [aj k ] is an n X n matrix with a# =£ 0, j = 1 ,***,//. 

INPUT: A, b, Initial approximation x C0) , tolerance e > 0, maximum number 
of iterations N 

OUTPUT : Approximate solution x (m> = [a* ( / 1) ] or failure message that x ov> does 
not satisfy the tolerance condition 


1 


2 


For m = 0, • • • , N - L do: 


For j = !,•••,/?, do: 



End 


if max \\f'+ l) - < e then OUTPUT x (w+1) . Stop 

[Procedure completed successfully ] 


End 

OUTPUT: “No solution satisfying the tolerance condition obtained after N 

iteration steps.” Stop 
[Procedure completed unsuccessfully] 

End GAUSS-SEIDEL 


Convergence and Matrix Norms 

An iteration method for solving Ax = b is said to converge for an initial x (0) if the 
corresponding iterative sequence x (0) , x (1) , x (2) , • • • converges to a solution of the given 
system. Convergence depends on the relation between x (m) and x (m+1) . To get this relation 
for the Gauss-Seidel method, we use (6). We first have 

(I + L)x <m+1> = b - Ux (w> 

and by multiplying by (I + L)“ x from the left, 

(7) x (,u+1) = Cx (m) + (I + L) -1 b where C = -(I + L) -1 U. 

The Gauss-Seidel iteration converges for every x (0> if and only if all the eigenvalues 
(Sec. 8.1) of the “iteration matrix” C = [c jk ] have absolute value less than 1. (Proof in 
Ref. [E5], p. 191, listed in App. I.) 

CAUTION! If you want to get C, first divide the rows of A by a to have main diagonal 
1, • • - , 1. If the spectra] radius of C (= maximum of those absolute values) is small, 
then the convergence is rapid. 

Sufficient Convergence Condition* A sufficient condition for convergence is 


( 8 ) 


IICII < 1. 
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EXAMPLE 2 


Here ||C|| is some matrix norm, such as 

( 9 ) IICH = Jsi c jk 2 

Vi=i/c=i 

or the greatest of the sums of the \cj k \ in a column of C 


(Frobenius norm) 


(10) 


IICII = max 2 kjfel 


(Column “sum” norm) 


i=i 


or the greatest of the sums of the in a row of C 

n 

(11) ||C|| = max 2 kjfcl 


(Row “sum” norm). 


fc- 1 


These are the most frequently used matrix norms in numerics. 

In most cases the choice of one of these norms is a matter of computational convenience. 
However, the following example shows that sometimes one of these norms is preferable 
to the others. 

Test of Convergence of the Gauss-Seidel Iteration 

Test whether the Gauss-Seidel iteration converges for the system 
2v 4- y 4- z = 4 
x 4-2 !y 4- z = 4 written 

*+ y + 2z = 4 z = 2-ix-y. 

Solution. The decomposition (multiply the matrix by 1/2 — why?) is 


•v = 2 - iv - iz 

y = 2 — 5 -v — §z 


" I 

1/2 

1/2“ 


' 0 

0 

0“ 


"0 

1/2 

1/2" 

1/2 

1 

1/2 

=I+L+U=I+ 

1/2 

0 

0 

4- 

0 

0 

1/2 

_ 1/2 

1/2 

1 _ 


J/2 

1/2 

0. 


.0 

0 

0 _ 


It shows that 



1 

0 

0“ 


“0 

1/2 

1/2’ 


"0 

-1/2 

-1/2’ 

C = -(I + L) -1 U = - 

-1/2 

1 

0 


0 

0 

1/2 

= 

0 

1/4 

-1/4 


1/4 

-1/2 

L 


.0 

0 

0 _ 


.0 

1/8 

3/8. 


We compute the Frobenius norm of C 

„ / 1 I 1 I 1 9 \ 1/2 / 50 \ 1/2 

,C| -(7*4*76 *16 ■"«*«) -(«) -™ 4<l 

and conclude from (8) that this Gauss-Seidel iteration converges. It is interesting that the other two norms would 
permit no conclusion, as you should verify. Of course, this points to the fact dial (8) is sufficient for convergence 
rather than necessary. ■ 


Residual. Given a system Ax = b, the residual r of x with respect to this system is 
defined by 


( 12 ) 


r = b — Ax. 
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Clearly, r = 0 if and only if x is a solution. Hence r ^ 0 for an approximate solution. In 
the Gauss-Seidel iteration, at each stage we modify or relax a component of an 
approximate solution in order to reduce a component of r to zero. Hence the Gauss-Seidel 
iteration belongs to a class of methods often called relaxation methods. More about the 
residual follows in the next section. 

Jacobi Iteration 

The Gauss-Seidel iteration is a method of successive corrections because for each 
component we successively replace an approximation of a component by a corresponding 
new approximation as soon as the latter has been computed. An iteration method is called 
a method of simultaneous corrections if no component of an approximation x (w) is used 
until all the components of x (m) have been computed. A method of this type is the Jacobi 
iteration, which is similar to the Gauss-Seidel iteration but involves not using improved 
values until a step has been completed and then replacing x (m) by x <w '*’ 1) at once, directly 
before the beginning of the next step. Hence if we write Ax = b (with = 1 as before!) 
in the form x = b + (I — A)x, the Jacobi iteration in matrix notation is 

(13) x (m+1) = b 4 (I - A)x (w) (ajj = 1). 

This method converges for every choice of x (0) if and only if the spectral radius of I — A 
is less than 1. It has recently gained greater practical interest since on parallel processors 
all n equations can be solved simultaneously at each iteration step. 

For Jacobi, see Sec. 10.3. For exercises, see the problem set. 





1. Verify the claim at the end of Example 2. 

2. Show that for the system in Example 2 the Jacobi 
iteration diverges. Hint Use eigenvalues. 

|£5] GAUSS-SEIDEL ITERATION 

Do 5 steps, starting from x 0 = [1 I 1] T and using 6S in 
the computation. Hint Make sure that you solve each 
equation for the variable that has the largest coefficient 
(why?). Show the details. 

3. x x 4 .v 2 4 6.v 3 = —61.3 

x i 4- 9 A " 2 2a*3 = 49.1 

8.V| 4 2a*2 a*3 — 185.8 

4. a’ 2 + lx 3 = 25.5 

5a*! 4 a* 2 — 0 

Ax 4 6 a* 2 4 A*3 = “10.5 

5. 5a*x -4 a* 2 4 2a* 3 = 19 

a*i 4 4a 2 - 2a 3 = -2 

2a*j 4 3a* 2 *4 8a* 3 = 39 


6. 4a*x a* 2 — 21 

-A*! + 4a*2 - a*3 = -45 

— a * 2 4 4a 3 = 33 

7. 1 0.V] 4 A* 2 4 A* 3 = 6 

A*! 4 10a 2 + A 3 = 6 

A'i 4 A* 2 4 1 0a* 3 = 6 

8. 4x x 4 5a* 3 = 12.5 

Xi 4 6a 2 4 2a* 3 = 18.5 

8ax 4 2a* 2 4 a* 3 = — 1 1 .5 

9. Apply the Gauss-Seidel iteration (3 steps) to the 
system in Prob. 7, starting from (a) 0, 0, 0, (b) 10, 10, 

10. Compare and comment. 

10. In Prob. 7, compute C (a) if you solve the first equation 
for a*!, the second for a* 2 , the third for a* 3 , proving 
convergence; (b) if you nonsensically solve the third 
equation for a*j, the first for a* 2 , the second for a* 3 , 
proving divergence. 
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11. CAS PROJECT. Gauss-Seidel Iteration, (a) Write 
a program for Gauss-Seidel iteration. 

(b) Apply the program to A(r)x = b, starting from 
[0 0 Of, where 

‘l t 1 1 I” 2” 

A(/) = t 1 t , b = 2 . 

_t t ij L 2 . 

For / = 0.2, 0.5. 0.8, 0.9 determine the number of steps 
to obtain the exact solution to 6S and the corresponding 
spectral radius of C. Graph the number of steps and 
the spectral radius as functions of / and comment. 

(c) Successive overreiaxation (SOR). Show that by 
adding and subtracting x° M) on die right, formula (6) 
can be written 

X<m + 1) = x O»> + 5 _ Lx <m *‘* 1) - (U + I)X (m) 

(Ojj = 1 ). 

Anticipation of further corrections motivates the 
introduction of an overreiaxation factor co > 1 to get 
the SOR formula for Gauss-Seidel 

x (m+ i ) = x <m> + co ( b - Lx <w+1) 

(14) 

- (U + I)x<"°) (ajj = 1) 

intended to give more rapid converg ence. A 
recommended value is (o = 2/(1 + V 1 — p), where p 
is the spectral radius of C in (7). Apply SOR to the 
matrix in (b) for t = 0.5 and 0.8 and notice the 


improvement of convergence. (Spectacular gains are 
made with larger systems.) 

1 2— 1 5 1 JACOBI ITERATION 

Do 5 steps, starting from x 0 = [1 1 l] T . Compare with 

the Gauss-Seidel iteration. Which of the two seems to 
converge faster? (Show the details of your work.) 

12. The system in Prob. 6 

13. The system in Prob. 5 

14. The system in Prob. 8 

15. Show convergence in Prob. 14 by verifying that I - A, 
where A is the matrix in Prob. 14 with the rows divided 
by the corresponding main diagonal entries, has the 
eigenvalues -0.519589 and 0.259795 ± 0.246603/. 

16-20 1 NORMS 

Compute the norms (9), (10), (II) for the following (square) 
matrices. Comment on the reasons for greater or smaller 
differences among the three numbers. 

16. The matrix in Prob. 3 

17. The matrix in Prob. 7 

18. The matrix in Prob. 8 



L 17 -12 -2J 


20.4 Linear Systems: Ill-Conditioning, Norms 

One does not need much experience to observe that some systems Ax = b are good, 
giving accurate solutions even under roundoff or coefficient inaccuracies, whereas others 
are bad, so that these inaccuracies affect die solution strongly. We want to see what is 
going on and whether or not we can “trust” a linear system. Let us first formulate the two 
relevant concepts (ill- and well-conditioned) for general numeric work and then turn to 
linear systems and matrices. 

A computational problem is called ill-conditioned (or ill-posed ) if “small” changes in 
the data (the input) cause “large” changes in the solution (the output). On the other hand, 
a problem is called well-conditioned (or well-posed) if “small” changes in the data cause 
only “small” changes in the solution. 

These concepts are qualitative. We would certainly regard a magnification of 
inaccuracies by a factor 100 as “large,” but could debate where to draw the line between 
“large” and “small,” depending on the kind of problem and on our viewpoint. Double 
precision may sometimes help, but if data are measured inaccurately, one should attempt 
changing the mathematical setting of the problem to a well-conditioned one. 
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EXAMPLE 1 


Let us now turn to linear systems. Figure 442 explains that ill-conditioning occurs if 
and only if the two equations give two nearly parallel lines, so that their intersection point 
(the solution of the system) moves substantially if we raise or lower a line just a little. 
For larger systems the situation is similar in principle, although geometry no longer helps. 
We shall see that we may regard ill-conditioning as an approach to singularity of the 
matrix. 




Fig. 442. (a) Well-conditioned and (b) ill-conditioned 

linear system of two equations in two unknowns 


An Ill-Conditioned System 

You may verify that the system 

0.9999* - l.OOOly = I 
* — y = 1 

has the solution * = 0.5, y = -0.5, whereas the system 

0.9999* - l.OOOly = 1 

* - y = I + € 

has the solution* = 0.5 + 5000.5 c,y = -0.5 4999.5c. This shows that the system is ill-conditioned because 

a change on the right of magnitude e produces a change in the solution of magnitude 5000e. approximately. We 
see that the Lines given by the equations have nearly the same slope. M 

Well-conditioning can be asserted if the main diagonal entries of A have large absolute 
values compared to those of the other entries. Similarly if A” 1 and A have maximum 
entries of about the same absolute value. 

Hi-conditioning is indicated if A" 1 has entries of large absolute value compared to those 
of the solution (about 5000 in Example 1) and if poor approximate solutions may still 
produce small residuals. 

Residual. The residual r of an approximate solution x of Ax = b is defined as 
(1) r = b — Ax. 


Now b = Ax, so that 

(2) r = A(x - x). 


Hence r is small if xhas high accuracy, but the converse may be false: 
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EXAMPLE 2 


Inaccurate Approximate Solution with a Small Residual 

The system 

1.0001a*! + a*2 = 2.0001 

xx + I.0001.V2 = 2.0001 


has the exact solution a*j = 1. .v 2 = 1. Can you see this by inspection? The very inaccurate approximation 
.vj = 2.0000. = 0.0001 has the very small residual (to 4D) 


“2.00011 [l.l 

.2.000 1 J Ll.i 


0001 

,0000 


1.0000 

1.0001 


[2.0000“ 

“2.0001" 

“2.0003" 

" -0.0002" 

. L0.000L 

.2.0001. 

.2.0001. 

. 0.0000. 


From this, a naive person might draw the false conclusion that the approximation should be accurate to 3 or 4 
decimals. 

Our result is probably unexpected, but we shall see that it has to do with the fact that the system is 
ill-conditioned. I 


Our goal is to show that ill-conditioning of a linear system and of its coefficient matrix 
A can be measured by a number, the "condition number ” k( A). Other measures for 
ill-conditioning have also been proposed, but k( A) is probably the most widely used one. 
k( A) is defined in terms of norm, a concept of great general interest throughout numerics 
(and in modern mathematics in general !). We shall reach our goal in three steps, discussing 

1. Vector norms 

2. Matrix norms 

3. Condition number k of a square matrix. 


Vector Norms 

A vector norm for column vectors x = [xj\ with n components (/x fixed) is a generalized 
length or distance. It is denoted by ||x|| and is defined by four properties of the usual 
length of vectors Ln three-dimensional space, namely, 

(a) ||x|| is a nonnegative real number. 

(b) ||x|| =0 if and only if x = 0. 

(c) \\kx\\ = |£| || x || for all k. 

(d) ||x + y || S ||x|| + ||y|| (Triangle inequality). 

If we use several norms, we label them by a subscript. Most important in connection with 
computations is the p-norm defined by 

(4) ||x|| p = (\xi\ p + |* 2 | p + ••• + |a-J p ) 1/p 

where p is a fixed number and p ^ 1. In practice, one usually takes p = 1 or 2 and, as a 
third norm, Hxjl^ (the latter as defined below), that is. 


(5) 

Mi 

- kl + • • 

• + kl 

Oi- norm”) 

(6) 

M. 

= + • 

. . + V 2 
' A n 

(“Euclidean” or “/ 2 -norm”) 

(7) 

IML 

= max I*, | 


(“/ae-norm”). 
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EXAMPLE 3 


For n = 3 the / 2 -norm is the usual length of a vector in three-dimensional space. The 
l v norm and /o©-norm are generally more convenient in computation. But all three norms 
are in common use. 

Vector Norms 

If x T = [2 -3 0 L -4], then Hxllx = 10, ||x|| 2 = V30, Mt* = 4. ■ 

In three-dimensional space, two points with position vectors x and x have distance |x - x| 
from each other. For a linear system Ax = b, this suggests that we take ||x — x|| as a 
measure of inaccuraty and call it the distance between an exact and an approximate solution, 
or the error of x. 

Matrix Norm 

If A is an n X n matrix and x any vector with n components, then Ax is a vector with n 
components. We now take a vector norm and consider ||x|| and ||Ax||. One can prove (see 
Ref. [E17]. p. 77, 92-93, listed in App. 1) that there is a number c (depending on A) such 
that 

(8) ||Ax|| ^ c||x|| for all x. 

Let x =£ 0. Then ||x|| > 0 by (3b) and division gives || Ax||/||x|| ^ c. We obtain the smallest 
possible c valid for all x (^ 0) by taking the maximum on the left. This smallest c is 
called the matrix norm of A corresponding to the vector notm we picked and is denoted 
by ||A||. Thus 

(9) || A|| = max (x * 0), 

the maximum being taken over all x =£ 0. Alternatively [see (c) in Team Project 24], 

(10) ||A|| = max || Ax|| . 

ii ii || X |, = pi ii 

The maximum in (10) and thus also in (9) exists. And the name “matrix norm ” is 
justified because ||A|| satisfies (3) with x and y replaced by A and B. (Proofs in Ref. 
[El 7] pp. 77, 92-93.) 

Note carefully that ||A|| depends on the vector norm that we selected. In particular, 
one can show that 

for the ly norm (5) one gets the column “sum” norm (10), Sec. 20.3, 
for the /oo-norm (7) one gets the row “sum” norm (11), Sec. 20.3. 

By taking our best possible (our smallest) c = ||A|| we have from (8) 

(11) I|AxM||A||||x||. 

This is the formula we shall need. Formula (9) also implies for two n X n matrices (see 
Ref. [E17], p. 98) 

(12) ||AB|| S ||A|| ||B||, thus || A” || S ||A||’\ 
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EXAMPLE 4 


THEOREM 1 


PROOF 


See Refs. [E9] and [El 7] for other useful formulas on norms. 

Before we go on, let us do a simple illustrative computation. 

Matrix Norms 

Compute the matrix norms of the coefficient matrix A in Example 1 and of its inverse A" 1 , assuming that we 
use (a) the / r vector norm, (b) the /-o- vector norm. 

Solution . We use (4*), Sec. 7.8, for the inverse and then (10) and (1 1) in Sec. 20.3. Thus 


'0.9999 

- i.ooo r 

, -5000.0 

5000.5" 

A = 


A" 1 = 


. 1.0000 

-l.ooooj 

.-5000.0 

4999.5. 


(a) The ^-vector norm gives the column “sum” norm (10), Sec. 20.3; from Column 2 we thus obtain 
||A|| = |-1.0001| + |-1.0000| = 2.0001. Similarly, ||A _1 || = 10000. 

(b) The /oc-vector norm gives the row “sum” norm (11), Sec. 20.3; thus ||A|| = 2, ||A~ 1 || = 10000.5 from 

Row 1. We notice that ||A _1 || is surprisingly large, which makes the product ||A|| ||A _1 || large (20001). We 
shall see below that this is typical of an ill-conditioned system. ■ 


Condition Number of a Matrix 

We are now ready to introduce the key concept in our discussion of ill-conditioning, the 
condition number k( A) of a (nonsingular) square matrix A, defined by 

(13) k( A)= || A|| || A"" 1 1| . 

The role of the condition number is seen from the following theorem. 


Condition Number 

A linear system of equations Ax = b and its matrix A whose condition number (13) 
is small are well-conditioned. A large condition number indicates ill-conditioning . 


b = Ax and (11) give ||b|| ^ ||A||||x|| . Let b =£ 0 and x =£ 0. Then division by 

INI 11*11 s» ves 


(14) 


_L<M 
11*11 = INI ' 


Multiplying (2) r = A(x - x) by A 1 from the left and interchanging sides, we have 
x — x = A” 1 ^ Now (11) with A”* 1 and r instead of A and x yields 

II* — *11 = ||A -1 r|| ^ ||A -1 || ||r||. 

Division by ||x|| [note that ||x|| ^ 0 by (3b)] and use of (14) finally gives 

(15) T a M l|A ' 1|ll|r||s M l|A '‘ lll|r|| - K(A, H' 

Hence if k( A) is small, a small ||r||/||b|| implies a small relative error ||x - x||/||x||, so 
that the system is well-conditioned. However, this does not hold if k(A) is large; then a 
small ||r||/||b|| does not necessarily imply a small relative error ||x - x||/||x||. ■ 
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EXAMPLE 5 


EXAMPLE 6 


Condition Numbers. Gauss-Seidel Iteration 



’5 

1 

1" 


A ' 1 = i 

" 12 

-2 

—2“ 

A = 

I 

4 

2 

has the inverse 

-2 

19 

-9 


.1 

2 

4. 



-9 

19. 


Since A is symmetric. (10) and (1 1) in Sec. 20,3 give the same condition number 

k(A) = ||A|| ||A _i || = 7-^ -30 = 3.75. 

We see that a linear system Ax = b with this A is well-conditioned. 

For instance, if b = [14 0 28J T , the Gauss algorithm gives the solution x = [2 —5 9] T (confirm this). 

Since the main diagonal entries of A are relatively large, we can expect reasonably good convergence of the 
Gauss-Seidel iteration. Indeed, starting from, say, x 0 = [1 I 1 ] T . we obtain the first 8 steps (3D values) 



*2 

*3 

1.000 

1.000 

1.000 

2.400 

-1.100 

6.950 

1.630 

-3.882 

8.534 

1.870 

-4.734 

8.900 

1.967 

-4.942 

8.979 

1.993 

-4.988 

8.996 

1.998 

-4.997 

8.999 

2.000 

-5.000 

9.000 

2.000 

-5.000 

9.000 


Ill-Conditioned Linear System 

Example 4 gives by (10) or (11), Sec. 20.3, for the matrix in Example 1 the very large condition number 
k( A) = 2.0001 * 10 000 = 2 • 10 000.5 = 20 0001. This confirms that the system is very ill-conditioned. 
Similarly in Example 2. where by (4*), Sec. 7.8 and 6D-computation. 

_ x i r i.oooi -1.00001 r 5000.5 -5.000.01 
0.0002 L-1.0000 1.0001 J L -5000.0 5000.5J 

so that (10), Sec. 20.3, gives a very large k ( A), explaining the surprising result in Example 2, 

k { A) = (1.0001 + 1.0000(5000.5 + 5000.0) « 20 002. ■ 

In practice, A -1 will not be known, so that in computing the condition number k( A), one 
must estimate || A“ 1 ||. A method for this (proposed in 1979) is explained in Ref. [E9] 
listed in App. I . 


Inaccurate Matrix Entries. k( A) can be used for estimating the effect Sx of an 
inaccuracy SA of A (errors of measurements of the aj k , for instance). Instead of Ax = b 
we then have 

(A + SA)(x + Sx) = b. 

Multiplying out and subtracting Ax = b on both sides, we obtain 

ASx + SA(x + Sx) = 0. 

Multiplication by A” 1 from the left and taking the second term to the right gives 
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EXAMPLE 


Sx = —A 1 * 8A(x + Sx). 

Applying (11) with A - * 1 and vector SA(x *f Sx) instead of A and x, we get 
||Sx|| = ||A“ 1 6A(x 4- Sx ) || ^ HA-1 ||SA(x + Sx)\\ . 

Applying (l 1) on the right, with SA and x — Sx instead of A and x, we obtain 

INI s IIA-1 ||SA|| ||x + tell . 

Now ||A _1 || = k(A)/||A|| by the definition of k( A), so that division by ||x + Sx|| shows 
that the relative inaccuracy of x is related to that of A via the condition number by the 
inequality 


( 16 ) 


INI 

HI 


INI 

||x + Sx|| 


HA" 1 !! ||SA|| = k(A) 


l|SA|| 
l|A|| ' 


Conclusion. If the system is well-conditioned, small inaccuracies ||SA||/||A|| can have 
only a small effect on the solution. However, in the case of ill-conditioning, if ||SA||/||A|| is 
small, ||5x||/||x|| may be large. 

Inaccurate Right Side. You may show that, similarly, when A is accurate, an inaccuracy 
5b of b causes an inaccuracy 5x satisfying 


(17) 


M 

HI 


S k(A) 



Hence ||Sx||/||x|| must remain relatively small whenever k(A) is small. 

Inaccuracies. Bounds (16) and (17) 

If each of the nine entries of A in Example 5 is measured with an inaccuracy of 0.1. then ||t>A|| = 9-0.1 and 
(16) gives 

■Ipjp £ 7.5 • 3 ' 7 °' 1 = 0.321 thus ||8x|| £ 0.321 ||x|| = 0.321 • 16 = 5.14. 

By experimentation you will find that the actual inaccuracy ||5x|| is only about 30% of the bound 5.14. This is 
typical. 

Similarly, if 5b = [0.1 0.1 0.1 ] T , then ||5b|| = 0.3 and ||b|| = 42 in Example 5, so that (17) gives 

£ 7.5 • = 0.0536, hence ||Sx|| £ 0.0536 • 16 = 0.857 

but this bound is again much greater than the actual inaccuracy, which is about 0.15. ■ 

Further Comments on Condition Numbers. The following additional explanations 
may be helpful. 

1. There is no sharp dividing line between “well-conditioned” and “ill-conditioned ” 

but generally the situation will get worse as we go from systems with small /c(A) to systems 

with larger /c(A). Now always k(A) ^ l f so that values of 10 or 20 or so give no reason 
for concern, whereas k( A) = 100, say, calls for caution, and systems such as those in 
Examples 1 and 2 are extremely ill-conditioned. 
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2. If /c(A) is large (or small) in one norm, it will be large (or small, respectively) in 
any other norm. See Example 5. 

3. The literature on ill-conditioning is extensive. For an introduction to it, see [E9]. 

This is the end of our discussion of numerics for solving linear systems. In the next 
section we consider curve fitting, an important area in which solutions are obtained from 
linear systems. 



as 





1-8 


VECTOR NORMS 


Compute (5), (6), (7). Compute a corresponding unit vector 
(vector of norm 1) with respect to the /oo-norm. 

1. [1 -6 5] 


2. [0.4 -1.2 0 8.0] 


3. [-4 4 3 -3] 

4. [0 0 1 0 0] 

5. [0.3 -0.1 0.5 1.0] 

6. [16 21 54 -119] 

7. [1 1 1 1 1 1] 

8. [3 0 0 -3 0] 


9. Show that M°c ^ l|x|| 2 ^ Mi- 


ll)-} 5 


MATRIX NORMS, 
CONDITION NUMBERS 


Compute the matrix norm and the condition number 
corresponding to the /j -vector norm. 



‘-3 

4 1 



“5 

7 



10. 

. 1 

J 


11. 

.7 

10 

- 



V3 




f 0 


0 

100“ 


3 1 







12. 

_ 0 

-V 3 J 


13. 

0 


100 

0 






.0.01 


0 

0 . 


" 21 

10.5 


7 

5.25“ 





10.5 

7 


5.25 

4.2 




14. 

7 

5.25 


4.2 

3.5 





.5.25 

4.2 


3.5 

3 _ 





“ 1 

0.1 

0 

-1 





15. 

0.1 

1 

0.1 







L 0 0.1 1 J 


16. Verify (11) for x = [4 —5 2] T taken with the 
/oc-norm and the matrix in Prob. 15. 

17. Verify (12) for the matrices in Probs. 10 and 11. 

18. Verify the calculations in Examples 5 and 6 of the text. 


1 19-20 1 ILL-CONDITIONED SYSTEMS 

Solve Ax = hi. Ax = b 2 , compare the solutions, and 
comment. Compute the condition number of A. 


“ 2 

1 . 4 "] n 

• 4 i 

“1.44“ 

19. A = 

,b] = 

( ,b 2 = 


.1.4 

1 J L 


. 1 . 


r 5 -7-1 r- 

-2“| 

'- 2 ' 

20. A = 

, bx = 

» b 2 = 


L — 7 

10J L 

3.J 

_3.1_ 


21. (Residual) For Ax = b a in Prob. 19 guess what the 

residual of x = [113 — 1 60] T might be (the solution 

being x = [0 1 ] T ). Then calculate and comment. 

22. Show that /c(A) ^ 1 for the matrix norms (10), (11), 
Sec. 20.3, and k ( A) ^ V/? for the Frobenius norm (9), 
Sec. 20.3. 

23. CAS EXPERIMENT. Hilbert Matrices. The 3 X 3 
Hilbert matrix is 



1 l 1 

L 3 4 5j 


The n X n Hilbert matrix is H n = where 

hj k = \f(j + k - 1). (Similar matrices occur in curve 
fitting by least squares.) Compute the condition number 
ff(Hn) for the matrix norm corresponding to the /oc- (or 
lr) vector norm, for n = 2, 3, ■ • • , 6 (or further if you 
wish). Try to find a formula that gives reasonable 
approximate values of these rapidly growing numbers. 
Solve a few linear systems involving an 1^ of your 
choice. 
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24. TEAM PROJECT. Norms, (a) Vector norms in our 
text are equivalent, that is, they are related by double 
inequalities; for instance, 

(a) || x Hoc = || x || , = hIIxII-c 

(18) 

(b) -^llxlU S HxlU s HxIU. 

Hence if for some x, one norm is large (or small), the 
other norm must also be large (or small). Thus in many 
investigations the particular choice of a norm is not 
essential. Prove (18). 

(b) The Cauchy-Schwarz inequality is 

|x T yl & ||x|| 2 ||y|| 2 . 

It is very important. (Proof in Ref. [GR7] listed in 
App. 1.) Use it to prove 

(19a) ||x|| 2 S ||x||i S Vn||x|| 2 


(19b) - 7 = l|x||i a ||x|| 2 a ||x|| 

V/? 

(c) Formula (10) is often more practical than (9). 
Derive (10) from (9). 

(d) Matrix norms. Illustrate (11) with examples. 
Give examples of (12) with equality as well as with 
strict inequality. Prove that the matrix norms (10), 
(11) in Sec. 20.3 satisfy the axioms of a norm 

llA|| a o. 

|| A|| = 0 if and only if A = 0, 

II*a|| = 1*1 I|a||, 

II A + B|| a || A|| + ||B||. 

25. WRITING PROJECT. Norms and Their Use in 
This Section. Make a list of the most important of the 
many ideas covered in this section and write a two-page 
report on them. 


20.5 Least Squares Method 

Having discussed numerics for linear systems, we now turn to an important application, 
curve fitting, in which the solutions are obtained from linear systems. 

In curve fitting we are given n points (pairs of numbers) (x ls Ji), • * • , (x n , y n ) and we 
want to determine a function /( x) such that 


fix l) /CO = ,v»> 

approximately. The type of function (for example, polynomials, exponential functions, sine 
and cosine functions) may be suggested by the nature of the problem (the underlying physical 
law. for instance), and in many cases a polynomial of a certain degree will be appropriate. 
Let us begin with a motivation. 

If we require strict equality f(x{) = y 1? • • • , /(A* n ) = y n and use polynomials of 
sufficiently high degree, we may apply one of the methods discussed in Sec. 19.3 in 
connection with interpolation. However, in certain situations this would not be the 
appropriate solution of the actual problem. For instance, to the four points 

(1) (-1.3, 0.103), (-0.1, 1.099), (0.2, 0.808), (1.3, 1.897) 

there corresponds the interpolation polynomial /( x) = a 3 - x -f 1 (Fig. 443), but if we 


Fig. 443. 



Approximate fitting of a straight line 
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graph the points, we see that they lie nearly on a straight line. Hence if these values are 
obtained in an experiment and thus involve an experimental error, and if the nature of the 
experiment suggests a linear relation, we better fit a straight line through the points (Fig. 
443). Such a line may be useful for predicting values to be expected for other values of 
x . A widely used principle for fitting straight lines is the method of least squares by 
Gauss and Legendre. In the present situation it may be formulated as follows. 


Method of Least Squares. The straight line 
(2) y = a + hx 

should be fitted through the given points (a* 1s y x ), • • • , (x n , y n ) so that the sum of 
the squares of the distances of those points from the straight line is minimum, where 
the distance is measured in the vertical direction ( the y-direction ). 


The point on the line with abscissa Xj has the ordinate a + bxj. Hence its distance from 
( Xj , yj) is \yj — a — bx 3 \ (Fig. 444) and that sum of squares is 

n 

<7 = 2 O’j ~ ~ bXjf. 

j = i 

q depends on a and b. A necessary condition for q to be minimum is 


(3) 


- = -22 O’j -a- bXj) = 0 

$ = -2 2 (>’i - <* - = 0 


(where we sum over j from 1 to n). Dividing by 2, writing each sum as three sums, and 
taking one of them to the right, we obtain the result 


an + b 2 Xj = 2 

a^xj + b^ xf = 2 x&i' 


These equations are called the normal equations of our problem. 



Fig. 444. Vertical distance of a point (xj, y 3 ) 
from a straight line y = a + bx 
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EXAMPLE 1 


Straight Line 

Using the method of least squares, fit a straight line to the four points given in formula (1). 

Solution . We obtain 

n = 4, 2 Xj = 0.1, 2 -v/ = 3.43, 2 3’j = 3.907, 2 ^ = 2.3839. 

Hence the normal equations are 

4 a + 0.l0/> = 3.9070 
0.1 « + 3.436 = 2.3839. 

The solution (rounded to 4D) is a — 0.9601, b = 0.6670, and we obtain the straight line (Fig. 443) 

v = 0.9601 + 0.6670.V. ■ 


Curve Fitting by Polynomials of Degree m 

Our method of curve fitting can be generalized from a polynomial y = a 4- bx to a 
polynomial of degree m 

(5) p(x) = b Q + b x x + - - - + b m x m 

where m ^ n — 1. Then q takes the form 

n 

4 = 2 (Vj - P(Xj)) 2 
j-l 


and depends on m + 1 parameters b 0 , • • • , b m . Instead of (3) we then have m + 1 
conditions 


( 6 ) 



J)q_ 

<>b m 


= 0 


which give a system of m 4- 1 normal equations. 

In the case of a quadratic polynomial 

(7) p{x) = b 0 + b x x + btf 2 


the normal equations are (summation from l to n) 

b 0 n + h 2 Xf + b 2 2 x i = 2 .Vj 

(8) b 0 2 Xj + b 1 2 xf + b 2 2 xf = 2 xjyj 

bo 2 xf + bx 2 xf + b 2 2 */ = 2 V )’j- 


The derivation of (8) is left to the reader. 
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EXAMPLE 2 Quadratic Parabola by Least Squares 

Fit a parabola through the data (0. 5), (2, 4), (4, 1), (6, 6), (8, 7). 

Solution. For the normal equations we need n = 5, Xxj = 20, 2aj 2 = 120, 2 x* = 800. 2a j 4 = 5664, 
'Zyj = 23, = 104. 2.Vj 2 Vj = 696. Hence these equations are 

56 0 + 20/7 X 4* 120^ 2 = 23 
20^o + 120 b! + 800/>2 = 104 
120/? o + 800 + 5664fc 2 = 696. 

Solving them we obtain the quadratic least squares parabola (Fig. 445) 

y = 5.1 1429 - 1.41429a* + 0.21429a 2 . ■ 



Fig. 445. Least squares parabola in Example 2 


For a general polynomial (5) the normal equations form a linear system of equations in the 
unknowns b 0 , • • • , b m . When its matrix M is nonsingular, we can solve the system by 
Cholesky’s method (Sec. 20.2) because then M is positive definite (and symmetric). When 
the equations are nearly linearly dependent, the normal equations may become ill- 
conditioned and should be replaced by other methods; see [E5], Sec. 5.7, listed in App. 1. 

The least squares method also plays a role in statistics (see Sec. 25.9). 



1-6 


FITTING A STRAIGHT LINE 


Fit a straight line to the given points (jc, y) by least squares. 
Show the details. Check your result by sketching the points 
and the line. Judge the goodness of fit. 


1. (2, 0), (3, 4), (4, 10), (5, 16) 


2. How does the line in Prob. 1 change if you add a point 
far above it, say, (3, 20)? 

3. (2.5, 8.0), (5.0, 6.9), (7.5, 6.2), (10.0, 5.0) 

4. (Ohm’s law U = Ri) Estimate the resistance R from 
the least squares line that fits (/, U) = (2.0, 104), 
(4.0, 206), (6.0, 3 14), ( 1 0.0, 530). 

5. (Average speed) Estimate the average speed v ay of a 
car traveling according to s = v . • / [km] (.? = distance 


traveled, t [h] = time) from (/, s) = (9, 140), (10, 220), 
(11,310), (12,410). 

6. (Hooke’s law F = ks) Estimate the spring modulus k 
from the force F [lb] and extension s [cm], where 
(F, s) = (1, 0.50), (2, 1.02), (4, 1.99), (6, 3.01). 
(10, 4.98), (20, 10.03). 


7. Derive the normal equations (8). 


8-10 


FITTING A QUADRATIC PARABOLA 


Fit a parabola (7) to the given points (jc, y) by least squares. 
Check by sketching. 

8. (-1, 3), (0, 0), (1, 2), (2, 8) 


9. (0, 4), (2, 2), (4,-1), (6, -5) 
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10. Worker’s time on duty .v [h] 12 3 4 5 

Worker's reaction lime [sec] 1.50 1.28 1.40 1.85 2.20 

11. Fit (2) and (7) by least squares to ( — 1 .0. 5.4), (-0.5, 4.1), 
(0, 3.9). (0.5. 4.8). (1.0, 6.3), (1.5, 9.3). Graph the data 
and the curves on common axes and comment. 

12. (Cubic parabola) Derive the formula for the normal 
equations of a cubic least squares parabola. 

13. Fit curves (2) and (7) and a cubic parabola by least 
squares to (-2, -35), (-1, -9), (0, - I), (1, -1), 
(2, 17), (3, 63). Graph the three curves and the points 
on common axes. Comment on the goodness of fit. 

14. CAS PROJECT. Least Squares. Write programs for 
calculating and solving the normal equations (4) and 
(8). Apply the programs to Probs. 3, 5, 9, 11. If your 
CAS has a command for fitting (Maple and 
Maihematica do), compare your results with those by 
your CAS commands. 

15. CAS EXPERIMENT. Least Squares versus 
Interpolation. For the given data and for data of your 
choice find the interpolation polynomial and the least 
squares approximations (linear, quadratic, etc.). 
Compare and comment. 

(a) (-2, 0), (-1,0), (0, 1), (1,0), (2, 0) 

(b) (-4, 0), (-3, 0), (-2, 0), (-1, 0), (0, 1), 
(1,0), (2, 0), (3, 0). (4, 0) 

(c) Choose five points on a straight line, e.g., (0, 0), 
(l, 1), * ■ * , (4, 4). Move one point 1 unit upward and 
find the quadratic least squares polynomial. Do this 
for each point. Graph the five polynomials on 
common axes. Which of the five motions has the 
greatest effect? 
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16. TEAM PROJECT. The least squares approximation 
of a function /( x) on an interval a ^ a* ^ b by a function 

FmW = <* 0 y Q {x) + a x yi(x) + • * * + a m y m (x) 

where y 0 (x), • • • , y m (-v) are given functions, requires the 
determination of the coefficients a 0i ’ * * » such that 

r b 

(9) J [/« - F m (x)] 2 dx 

becomes minimum. This integral is denoted by 
11/ ~ F'mll 2 . and ||/ - F m || is called the L. 2 -n orm of 
/ — F m ( L suggesting Lebesgue 2 ). A necessary condition 
for that minimum is given by d||/ - F m \\ 2 /dcij = 0, 
j = 0, • • • , m [the analog of (6)]. (a) Show that this 
leads to m + I normal equations (j = 0, • • • . m) 

m 

2 hjk a k = where 

(10) h jk = J Vj(x)y k (x) dx, 

J a 

bj = [ f(x)yj(x) dx. 

J a 

(b) Polynomial. What form does (10) take if 
F m (x) = a 0 + x + • • • + a m x m ? What is the 
coefficient matrix of (10) in this case when the interval 
is 0 = a* = 1 ? 

(c) Orthogonal functions. What are the solutions of 
(10) if v 0 (a), • • • , y m ( a) are orthogonal on the interval 
ci = x = /?? (For the definition, see Sec. 5.7. See also 
Sec. 5.8.) 


20.6 Matrix Eigenvalue Problems: Introduction 

In the remaining sections of this chapter we discuss some of the most important ideas and 
numeric methods for matrix eigenvalue problems. This very extensive part of numeric 
linear algebra is of great practical importance, with much research going on, and hundreds, 
if not thousands of papers published in various mathematical journals (see the references 
in [E8], [E9], [Ell], [E29]). We begin with the concepts and general results we shall need 
in explaining and applying numeric methods for eigenvalue problems. (For typical models 
of eigenvalue problems see Chap. 8.) 


2 HENRI LEBESGUE (1875-1941), great French mathematician, creator of a modern theory of measure and 
integration in his famous doctoral thesis of 1902. 
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THEOREM 1 


An eigenvalue or characteristic value (or latent root) of a given n X n matrix A = [a jk ] 
is a real or complex number A such that the vector equation 

(1) Ax = Ax 

has a nontrivial solution, that is, a solution x ^ 0, which is then called an eigenvector or 
characteristic vector of A corresponding to that eigenvalue A. The set of all eigenvalues 
of A is called the spectrum of A. Equation ( 1 ) can be written 

(2) (A — AI)x = 0 

where I is the n X n unit matrix. This homogeneous system has a nontrivial solution if 
and only if the characteristic determinant det (A — AI) is 0 (see Theorem 2 in Sec. 7.5). 
This gives (see Sec. 8.1) 


Eigenvalues 

The eigenvalues of A are the solutions A of the characteristic equation 

(3) det (A - AI) = 

tin - A 
a 21 

°12 

a 22 ~ A 

a \ n 
a 2n 

= 0. 


a nl 

a n2 

a nn ~ A 



Developing the characteristic determinant, we obtain the characteristic polynomial of A, 
which is of degree n in A. Hence A has at least one and at most n numerically different 
eigenvalues. If A is real, so are the coefficients of the characteristic polynomial. By familiar 
algebra it follows that then the roots (the eigenvalues of A) are real or complex conjugates 
in pairs. 

We shall usually denote the eigenvalues of A by 

Ai, ^2> * * * ? Ki 

with the understanding that some (or all) of them may be equal. 

The sum of these n eigenvalues equals the sum of the entries on the main diagonal of 
A, called the trace of A; thus 

n n 

(4) trace A = 2 % = X A fc . 

fc=i 

Also, the product of the eigenvalues equals the determinant of A, 

(5) det A = A 1 A 2 * * * A^. 

Both formulas follow from the product representation of the characteristic polynomial, 
which we denote by /(A), 


/(A) — ( l) n (A A X )(A — A 2 ) • • • (A — A n ). 
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THEOREM 2 


THEOREM 3 


THEOREM 4 


If we take equal factors together and denote the numerically distinct eigenvalues of A by 
A x , • • • , A r (r ^ n\ then the product becomes 

(6) /(A) = (— l) n (A - A x ) mi (A - A 2 f* • • • (A - A/\ 

The exponent nij is called the algebraic multiplicity of A j. The maximum number of 
linearly independent eigenvectors corresponding to A^- is called the geometric multiplicity 
of )y. It is equal to or smaller than ny. 

A subspace S of R n or C n (if A is complex) is called an invariant subspace of A if 
for every v in S the vector Av is also in 5. Eigenspaces of A (spaces of eigenvectors; 
Sec. 8.1) are important invariant subspaces of A. 

An n X n matrix B is called similar to A if there is a nonsingular n X n matrix T such 
that 

(7) B = T~ 1 AT. 

Similarity is important for the following reason. 


Similar Matrices 


Similar matrices have the same eigenvalues. If x is an eigenvector of A, then 
y = T~*x is an eigenvector of B in (7) corresponding to the same eigenvalue. (Proof 
in Sec. 8.4.) 


Another theorem that has various applications in numerics is as follows. 


Spectral Shift 

If A has the eigenvalues \ l9 • • • t A„, then A — kl with arbitrary k has the eigenvalues 
\ x - fc • • • , A n — L 


This theorem is a special case of the following spectral mapping theorem. 


Polynomial Matrices 

If A is an eigenvalue of A, then 


q( A) = or s A s + a s _!A s-1 + • 

• • 4* oti A 4* a 0 

is an eigenvalue of the polynomial matrix 


A) = 0CgA s + a s _jA s-1 + • • 

• + qtjA 4- ctqI. 


Ax = Ax implies A 2 x = AAx = AAx = A 2 x, A 3 x = A 3 x, etc. Thus 

q(A)x = ( a s A s + a^A* -1 4- • • -)x 
= a s A s x 4* a $ _ x A s ~ l x 4- • • • 

= a $ \*x + g^A^x 4- • • • = q( A)x. ■ 


PROOF 
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The eigenvalues of important special matrices can be characterized as follows. 


THEOREM 5 


Special Matrices 

The eigenvalues of Hermitian matrices (i.e., A T = A), hence of real symmetric 
matrices (i.e., A T = A), are real The eigenvalues of skew -Hermitian matrices (i.e., 
A = —A), hence of real skew-symmetric matrices (i.e., A T = —A) are pure 
imaginary or 0. The eigenvalues of unitary matrices (i.e., A = A" 1 ), hence of 
orthogonal matrices (i.e., A T = A” 1 ), have absolute value 1. (Proofs in Secs. 8.3 
and 8.5.) 


The choice of a numeric method for matrix eigenvalue problems depends essentially on 
two circumstances, on the kind of matrix (real symmetric, real general, complex, sparse, 
or full) and on the kind of information to be obtained, that is, whether one wants to know 
all eigenvalues or merely specific ones, for instance, the largest eigenvalue, whether 
eigenvalues and eigenvectors are wanted, and so on. It is clear that we cannot enter into 
a systematic discussion of all these and further possibilities that arise in practice, but we 
shall concentrate on some basic aspects and methods that will give us a general 
understanding of this fascinating field. 

20.7 Inclusion of Matrix Eigenvalues 

The whole numerics for matrix eigenvalues is motivated by the fact that except for a few 
trivial cases we cannot determine eigenvalues exactly by a finite process because these 
values are the roots of a polynomial of /?th degree. Hence we must mainly use iteration. 

In this section we state a few general theorems that give approximations and error 
bounds for eigenvalues. Our matrices will continue to be real (except in formula (5) below), 
but since (nonsymmetric) matrices may have complex eigenvalues, complex numbers will 
play a (very modest) role in this section. 

The important theorem by Gerschgorin gives a region consisting of closed circular disks 
in the complex plane and including all the eigenvalues of a given matrix. Indeed, for each 
j = 1, • • • , n the inequality (1) in the theorem determines a closed circular disk in the 
complex A-plane with center a^ and radius given by the right side of (1); and Theorem 
1 states that each of the eigenvalues of A lies in one of these n disks. 


THEOREM I 


Gerschgorin’s Theorem 

Let A be an eigenvalue of an arbitrary n X n matrix A = [aj k ]. Then for some 
integer j (1 =./ = n) we have 

(1) | ajj - A| S Ifl^l + \a i2 \ + • • • + + • • • + Iflj-J. 


PROOF Let x be an eigenvector corresponding to an eigenvalue A of A. Then 

(2) Ax = Ax or (A - AI)x = 0. 

Let Xj be a component of x that is largest in absolute value. Then we have \x m lx jt | S I for 
in — !,•••. n. The vector equation (2) is equivalent to a system of n equations for the 
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n components of the vectors on both sides. The yth of these n equations with j as just 
indicated is 


Gji x i + • • • 4* 4* (cijj A)a j 4- fijj+i-Xj+i “b • • • 4- cij n x n 0. 

Division by Xj (which cannot be zero; why?) and reshuffling terms gives 


Qjj A Clji 


a jJ - 1 


X J - 1 _ „ 1 

v . a jJ + 1 




By taking absolute values on both sides of this equation, applying the triangle inequality 
\a 4- b\ ^ |a| 4- |Z?| (where a and b are any complex numbers), and observing that because 
of the choice of j (which is crucial!), Iat/a;,! ^ 1, • • • , \x n /Xj\ ^ 1, we obtain (1), and the 
theorem is proved. ■ 


EXAMPLE 1 Gerschgorin’s Theorem 

For ihe eigenvalues of the matrix 



" 0 

1/2 

1/2“ 

A = 

1/2 

5 

1 


.1/2 

1 

1 _ 


we gel the Gerschgorin disks (Fig. 446) 

Dy. Center 0, radius 1. D 2 : Center 5, radius 1.5, D 3 : Center 1. radius 1.5. 

The centers are the main diagonal entries of A. These would be the eigenvalues of A if A were diagonal. 
We can take these values as crude approximations of the unknown eigenvalues (3D values) \y - -0.209, 
A 2 = 5.305, A 3 = 0.904 (verify tliis): then the radii of the disks are corresponding error bounds. 

Since A is symmetric, it follows from Theorem 5, Sec. 20.6, that the spectrum of A must actually lie in the 
intervals [-1, 2.5] and [3.5. 6.5]. 

It is interesting that here the Gerschgorin disks form two disjoint sets, namely, Dy U Z) 3 , which contains two 
eigenvalues, and D 2 , which contains one eigenvalue. This is typical, as the following theorem shows. H 



Fig. 446. Gerschgorin disks in Example 1 


THEOREM 2 


Extension of Gerschgorin’s Theorem 

If p Gerschgorin disks form a set S that is disjoint from the n — p other disks of a 
given matrix A, then S contains precisely p eigenvalues of A ( each counted with its 
algebraic multiplicity , as defined in Sec. 20.6). 


Idea of Proof. Set A — B + C, where B is the diagonal matrix with entries and 
apply Theorem 1 to A t = B 4- t C with real / growing from 0 to 1 . ■ 




868 


CHAP. 20 Numeric Linear Algebra 


EXAMPLE 


THEOREM 


Another Application of Gerschgorin’s Theorem. Similarity 


Suppose that we have diagonalized a matrix by some numeric method that left us with some off-diagonal entries 
of size 10“ 5 . say. 


A = 


I O ’ 5 
J0~ 5 


1(T 5 I0 _5 ~ 

2 10“ 5 . 

I0” 5 4 _ 


What can we conclude about deviations of the eigenvalues from the main diagonal entries? 

Solution . By Theorem 2. one eigenvalue must lie in the disk of radius 2* 10“ 5 centered at 4 and two 
eigenvalues (or an eigenvalue of algebraic multiplicity 2) in the disk of radius 2 • 10“ 5 centered at 2. Actually, 
since the matrix is symmetric, these eigenvalues must lie in the intersections of these disks and the real axis, 
by Theorem 5 in Sec. 20.6. 

We show how an isolated disk can always be reduced in size by a similarity transformation. The matrix 



”l 

0 

0 


2 

10~ s 

to -5 " 


"l 

0 

0 

w 

II 

H 

1 

> 

H 

II 

0 

1 

0 


1(T 5 

2 

i<r s 


0 

1 

0 


.0 

0 

I(T 5 . 


1 

© 

1 

Ol 

10" s 

4 . 


.0 

0 

10 5 _ 


2 10" 5 I 


10" 


10" ]o 


2 1 

IO" 10 4. 


is similar to A. Hence by Theorem 2, Sec. 20.6. it has the same eigenvalues as A. From Row 3 we get the smaller 
disk of radius 2 • IO” 10 . Note that the other disks got bigger, approximately by a factor of 10 5 . And in choosing 
T we have to watch that the new disks do not overlap with the disk whose size we want to decrease. 

For further interesting facts, see the new book [E28J. H 


By definition, a diagonally dominant matrix A = [tf j7c ] is an n X n matrix such that 


(3) \cijj\ = 2 Itfjkl j I? * * * > n 

k*j 

where we sum over all off-diagonal entries in Row j. The matrix is said to be strictly 
diagonally dominant if > in (3) for all j. Use Theorem 1 to prove the following basic 
property. 


Strict Diagonal Dominance 

Strictly diagonally dominant matrices are nonsingular. 


Further Inclusion Theorems 

An inclusion theorem is a theorem that specifies a set which contains at least one 
eigenvalue of a given matrix. Thus, Theorems 1 and 2 are inclusion theorems; they even 
include the whole spectrum. We now discuss some famous theorems that yield further 
inclusions of eigenvalues. We state the first two of them without proofs (which would 
exceed the level of this book). 
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THEOREM 4 


EXAMPLE 3 


THEOREM 5 


Schur’s Theorem 3 

Let A = [cij k ] be an n X n matrix. Then for each of its eigenvalues A 1? • • • , A n , 

n n n 

(4) |Aj 2 S 2 I Ail 2 |«jfc| 2 (Schur’s inequality). 

i= 1 

In (4) the second equality sign holds if and only if A is such that 

(5) A T A = AA T 


Matrices that satisfy (5) are called normal matrices. It is not difficult to see that Hermitian, 
skew-Hermitian, and unitary matrices are normal, and so are real symmetric, skew-symmetric, 
and orthogonal matrices. 


Bounds for Eigenvalues Obtained from Schur’s Inequality 


For the matrix 



2 " 

4 

28 . 


we obtain from Schur’s inequality |A| ^ V 1949 = 44.1475. You may verify that the eigenvalues are 30. 25. 
and 20. Thus 30 2 + 25 2 + 20 2 = 1925 < 1949; in fact, A is not normal. ■ 


The preceding theorems are valid for every real or complex square matrix. Other theorems 
hold for special classes of matrices only. Famous is the following. 


Perron’s Theorem 4 

Let Abe a real n X n matrix whose entries are all positive. Then A has a positive 
real eigenvalue A = p of multiplicity 1 . The corresponding eigenvector can be chosen 
with all components positive. ( The other eigenvalues are less than p in absolute 
value.) 


For a proof see Ref. [B3], vol. II, pp. 53-62. The theorem also holds for matrices with 
nonnegative real entries (‘Terron-Frobenius Theorem” 4 ) provided A is irreducible, 
that is, it cannot be brought to the following form by interchanging rows and columns; 
here B and F are square and 0 is a zero matrix. 

“B C“ 

.0 F_ 


3 ISSAI SCHUR (1875-1941), German mathematician, also known by his important work in group theory. 

4 OSKAR PERRON (1880-1975), GEORG FROBENI US (1849-1917), LOTHAR COLLATZ (1910-1990). 
German mathematicians, known for their work in potential theory, ODEs (Sec. 5.4) and group theory, and 
numerics, respectively. 
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THEOREM 6 


PROOF 


EXAMPLE 4 


Perron’s theorem has various applications, for instance, in economics. It is interesting 
that one can obtain from it a theorem that gives a numeric algorithm: 


Collatz Inclusion Theorem 4 

Let A = [a jk ] be a real nX n matrix whose elements are all positive . Let x be any 
real vector whose components jc x , • • • , x n are positive , and let y lf ,y n be the 
components of the vector y = Ax. Then the closed interval on the real axis bounded 
by the smallest and the largest of the n quotients qj = yj/xj contains at least one 
eigenvalue of A. 


We have Ax = y or 

(6) y - Ax = 0. 

The transpose A T satisfies the conditions of Theorem 5. Hence A T has a positive eigenvalue 
A and, corresponding to this eigenvalue, an eigenvector u whose components Uj are all 
positive. Thus A T u = Au, and by taking the transpose we obtain u T A = Au T . From this 
and (6) we have 

u T (y - Ax) = u T y - u T Ax = u T y - Au T x = u T (y - Ax) = 0 
or written out 

n 

2 “ Ajcj) = 0. 

3=1 

Since all the components Uj are positive, it follows that 

yj — A xj ^ 0, that is, q* ^ A for at least one j> 

(7) and 

yj — Kxj 0, that is, cjj ^ A for at least one j. 

Since A and A T have the same eigenvalues, A is an eigenvalue of A, and from (7) the 
statement of the theorem follows. ■ 

Bounds for Eigenvalues from Collatz’s Theorem. Iteration 

For a given matrix A with positive entries we choose an x = Xq and iterate, that is. we compute 
Xi = Ax 0 . x 2 = Ax lt • * • , x 20 = Ax 19 . In each step, taking x = Xj and y = A Xj - x^ +1 we compute an 
inclusion interval by Collaiz’s theorem. This gives (6S) 



'0.49 

0.02 

0.22' 


“f 


'0.73' 


'0.5481’ 

A = 

0.02 

0.28 

0.20 

,x 0 = 

1 

= 

0.50 

,x 2 = 

0.3186 


. 0.22 

0.20 

0.40. 


J. 


.0.82. 


.0.5886. 



"0.00216309" 


"0.00155743 " 

x 19 ~ 

0.00108155 

II 

1 

0.000778713 


.0.00216309. 


.0.00155743 . 
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and the intervals 0.5 ^ A ^ 0.82, 0.3186/0.50 = 0.6372 A ^ 0.5481/0.73 = 0.750822, etc. These intervals 
have length 


j 

1 

2 

3 

10 

15 

20 

Length 

0.32 

0.113622 

0.0539835 

0.0004217 

0.0000132 

0.0000004 


Using the characteristic polynomial, you may verify that the eigenvalues of A are 0.72, 0.36, 0.09. so that those 
intervals include the largest eigenvalue, 0.72. Their lengths decreased with j, so that the iteration was worthwhile. 
The reason will appear in the next section, where we discuss an iteration method for eigenvalues. ■ 


PROBLEM SET 20.7 


GERSCHGORIN DISKS 

Find and sketch disks or intervals that contain the 
eigenvalues. If you have a CAS, find the spectrum and 
compare. 



7. (Similarity) Find T” t AT such that in Prob. 2 the 
radius of the Gerschgorin circle with center 5 is reduced 
by a factor 1/100. 


8. By what integer factor can you at most reduce the 
Gerschgorin circle with center 3 in Prob. 6? 

9. If a symmetric n X n matrix A = [a jk ] has been 
diagonalized except for small off-diagonal entries of 
size 10“ 6 , what can you say about the eigenvalues? 

10. (Extended Gerschgorin theorem) Prove Theorem 2. 

11. Prove Theorem 3. 

12. (Normal matrices) Show that Hermitian, skew- 
Hermitian, and unitary matrices (hence real symmetric, 
skew-symmetric, and orthogonal matrices) are normal. 
Why is this of practical interest? 

13. (Spectral radius p(A)) Show that p( A) cannot be 
greater than the row sum norm of A. 

14. (Eigenvalues on the circle) Illustrate with a 2 X 2 
matrix that an eigenvalue may very well lie on a 
Gerschgorin circle (so that Gerschgorin disks can 
generally not be replaced with smaller disks without 
losing the inclusion property). 

15-17 1 SCHUR’S INEQUALITY 

Use (4) to obtain an upper bound for the spectral radius: 

15. In Prob, 1 

16. In Prob, 6 

17. In Prob. 3 

18-19 1 COLLATZ’S THEOREM 

Apply Theorem 6, choosing the given vectors as vectors x. 
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20. CAS EXPERIMENT. Collatz Iteration, (a) Write 
a program for the iteration in Example 4 (with any 
A and x 0 ) that at each step prints the midpoint 
(why?), the endpoints, and the length of the inclusion 
interval. 


(b) Apply the program to symmetric matrices of your 
choice. Explore how convergence depends on the 
choice of initial vectors. Can you construct cases in 
which the lengths of the inclusion intervals are not 
monotone decreasing? Can you explain the reason? 
Can you experiment on the effect of rounding? 


20.8 Power Method for Eigenvalues 

A simple standard procedure for computing approximate values of the eigenvalues of an 
n X n matrix A = [cij k ] is the power method. In this method we start from any vector 
x 0 (=£ 0) with n components and compute successively 


x x = Ax 0 , x 2 = Ax x , * • , x s = Ax s _ v 

For simplifying notation, we denote x s _ x by x and x s by y, so that y = Ax. 

The method applies to any n X n matrix A that has a dominant eigenvalue (a A such 
that |A| is greater than the absolute values of the other eigenvalues). If A is symmetric , it 
also gives the error bound (2), in addition to the approximation (1). 


Power Method, Error Bounds 


Let A be an n X n real symmetric matrix. Let x (¥= 0) be any real vector with n 

components. Furthermore ; let 


y = Ax, m 0 = x T x, m 1 = x T y, 

m 2 = y T y- 

Then the quotient 


>-* 

1! 

a la 

o 

(Rayleigh 5 quotient) 

is an approximation for an eigenvalue A of A (usually that which is greatest in 

absolute value, but no general statements are possible). 


Furthermore , if we set q = A — e, so that € is the error of q, then 




^ORD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919), great English physicist and mathematician, 
professor at Cambridge and London, known for his important contributions to various branches of applied 
mathematics and theoretical physics, in particular, the theory of waves, elasticity, and hydrodynamics. In 1904 
he received a Nobel Prize in physics. 
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PROOF 


8 Z denotes the radicand in (2). Since m x = qm Q by (1), we have 

(3) (y ~ qx) J (y - qx) = m 2 ~ 2 qm x 4* q 2 m 0 = m 2 - q 2 m 0 = S 2 m 0 . 

Since A is real symmetric, it has an orthogonal set of n real unit eigenvectors z ls • • • , z n 
corresponding to the eigenvalues A 1? • • • , A n , respectively (some of which may be equal). 
(Proof in Ref. [B3], vol. 1, pp. 270-272, listed in App. 1.) Then x has a representation of 
the form 


x = a x z x + • • • + a n z n . 
Now Az x = AiZ : , etc., and we obtain 


y = Ax = a x AjZj + • • • + ci n X, t z n 


and, since the Zj are orthogonal unit vectors, 

(4) m 0 = x T x = a i 2 + • • ■ H- a n 2 . 

It follows that in (3), 

y - qx = a x ( A : - q)z x + • • • + a n { A,, - q) z n . 

Since the Zj are orthogonal unit vectors, we thus obtain from (3) 

(5) S 2 m 0 = (y - qx) T (y - qx) = a 1 2 (A 1 - q) 2 + • ■ ■ + «n(Ki - ^) 2 - 

Now let X c be an eigenvalue of A to which q is closest, where c suggests “closest”. Then 
(A c — q) 2 ^ (A j — q) 2 for j = 1, • • * , n. From this and (5) we obtain the inequality 

S 2 m 0 ^ (A c - q) 2 (a x 2 + • • • + a n 2 ) = (A c - ^) 2 /?2 0 . 

Dividing by m 0 > taking square roots, and recalling the meaning of S 2 gives 



This shows that 5 is a bound for the error e of the approximation q of an eigenvalue of 
A and completes the proof. ■ 


The main advantage of the method is its simplicity. And it can handle sparse matrices 
too large to store as a full square array. Its disadvantage is its possibly slow convergence. 
From the proof of Theorem 1 we see that the speed of convergence depends on the ratio 
of the dominant eigenvalue to the next in absolute value (2:1 in Example 1, below). 

If we want a convergent sequence of eigenvectors, then at the beginning of each step 
we scale the vector, say, by dividing its components by an absolutely largest one, as in 
Example 1, as follows. 
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EXAMPLE 1 


EXAMPLE 2 


Application of Theorem 1. Scaling 

For the symmetric matrix A in Example 4, Sec. 20.7, and x 0 = [I I I] T we obtain from ( 1 ) and (2) and the 
indicated scaling 



"0.49 

0.02 

0.22" 


“1" 


"0.890244" 


"0.931193" 

A = 

0.02 

0.28 

0.20 

, x 0 = 

1 

, Xj 

= 

0.609756 

» *2 = 

0.541284 


.0.22 

0.20 

0.40. 


_L 


.1 


Li 

J 


"0.990663“ 



"0.999707" 


"0.999991" 


*5 = 

0.504682 

• 

x 10 = 

0.500146 

X 15 = 


0.500005 



.1 



Li 

J 


Li 

J 



Here Ax 0 = [0.73 0.5 0.82] T , scaled to x 1 = [0.73/0.82 0.5/0.82 I ] T . etc. The dominant eigenvalue is 
0.72. an eigenvector [1 0.5 1 1 T . The corresponding q and 8 are computed each time before the next scaling. 
Thus in the first step. 


<1 = 


'»i 

m 0 


x 0 t Axq 

x 0 T *o 


2.05 

— - — = 0.683333 


8 = 


/ '”2 
\ w o 


\i/2 / (Ax 0 ) T Ax 0 

/ \ x 0 T x 0 




= 0.134743. 


This gives the following values of q< 5, and the error € = 0.72 — q (calculations with 10D, rounded to 6D): 


j 

l 

2 

5 

10 

q 

0.683333 

0.716048 

0.719944 

0.720000 

5 

0.134743 

0.038887 

0.004499 

0.000141 

€ 

0.036667 

0.003952 

0.000056 

5 • 10~ 8 


The error bounds are much larger than the actual errors. This is typical, although the bounds cannot be improved: 
that is, for special symmetric matrices they agree with the errors. 

Our present results are somewhat better than those of Collatz’s method in Example 4 of Sec. 20.7, at the 
expense of more operations. M 

Spectral shift, the transition from A to A — kl 9 shifts every eigenvalue by —k. Although 
finding a good k can hardly be made automatic, it may be helped by some other method 
or small preliminary computational experiments. In Example 1, Gerschgorin’s theorem 
gives —0.02 ^ A ^ 0.82 for the whole spectrum (verify!). Shifting by —0.4 might be too 
much (then —0.42 ^ A ^ 0.42), so let us try -0.2. 

Power Method with Spectral Shift 

For A - 0.21 with A as in Example 1 we obtain the following substantial improvements (where the index 1 
refers to Example I and the index 2 to the present example). 


j 

1 

2 

5 

10 

Si 

0.134743 

0.038887 

0.004499 

0.000141 

^2 

0.134743 

0.034474 

0.000693 

1.8- 10 -6 


0.036667 

0.003952 

0.000056 

o 

00 

*2 

0.036667 

0.002477 

1.3 * 10“ 6 

9 • 10“ 12 
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PROBLEM SET 20.8 


[7^] POWER METHOD WITH SCALING 

Apply the power method (3 steps) with scaling, using 
Xo = [1 1] T or [1 1 If or [1 1 1 if, as 

applicable. Give Rayleigh quotients and error bounds. 
Show the details of your work. 


1 . 

"3.5 

2.0" 

ro.6 

2. 

0.8" 


-2.0 

0.5- 

Lo.8 

-0.6. 




“-2 

2 

3 " 



“ 2 

-1 

r 


4. 

2 

1 

6 


5. 

-1 

3 

2 



_ 3 

6 

- 2 . 



_ 1 

2 

3 . 



'0 

4 

0 

r 


" 5 

1 

0 

0 “ 


4 

-1 

2 

8 


1 

3 

1 

0 

6. 

0 

2 

3 

2 

7. 

0 

1 

3 

1 


_1 

8 

2 

- 2 . 


.0 

0 

1 

5 . 


8. (Optimality of 5) In Prob. 2 choose x 0 = [3 — l] T 
and show that q = 0 and 8 = 1 for all steps and that the 
eigenvalues are ± 1, so that the interval [q - 8, q + 6] 
cannot be shortened in general! Experiment with 
other x 0 . 


9. Prove that if x is an eigenvector, then 8 = 0 in (2). 
Give two examples. 

10. (Rayleigh quotient) Why does q generally 
approximate the eigenvalue of greatest absolute value? 
When will q be a good approximation? 

11. (Spectral shift, smallest eigenvalue) In Prob. 5 set 
B = A — 31 (as perhaps suggested by the diagonal 
entries) and try whether you may get a sequence of q's 
converging to an eigenvalue of A that is smallest (not 
largest) in absolute value. Use x 0 = [1 I l] T . Do 
8 steps. Verify that A has the spectrum {0, 3, 5). 

12. CAS EXPERIMENT. Power Method with Scaling. 

Shifting, (a) Write a program for n X n matrices that 
prints every step. Apply it to the (nonsymmetric!) 
matrix (20 steps), starting from [1 1 1] T . 



15 

12 

3" 

A = 

18 

44 

18 


.-19 

-36 

—7. 


(b) Experiment in (a) with shifting. Which shift do you 
find optimal? 

(c) Write a program as in (a) but for symmetric 
matrices that prints vectors, scaled vectors, q , and 8. 
Apply it to the matrix in Prob. 6. 

(d) Find a (nonsymmetric) matrix for which 8 in (2) 
is no longer an error bound. 

(e) Experiment systematically with speed of 
convergence by choosing matrices with the second 
greatest eigenvalue (i) almost equal to the greatest, (ii) 
somewhat different, (iii) much different. 


20.S Tridiagonalization and QR-Factorization 

We consider the problem of computing all the eigenvalues of a real symmetric matrix 
A = [a jk ], discussing a method widely used in practice. In the first stage we reduce the 
given matrix stepwise to a tridiagonal matrix, that is, a matrix having all its nonzero 
entries on the main diagonal and in the positions immediately adjacent to the main diagonal 
(such as A 3 in Fig. 447, Third Step). This reduction was invented by A. S. Householder 
(7. Assn. Comput. Machinery 5 (1958), 335-342). See also Ref. [E29] in App. 1. 

This Householder tridiagonalization will simplify the matrix without changing its 
eigenvalues. The latter will then be determined (approximately) by factoring the 
tridiagonalized matrix, as discussed later in this section. 
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Householder's Tridiagonalization Method 

An nX n real symmetric matrix A = [cij k ] being given, we reduce it by /t — 2 successive 
similarity transformations (see Sec. 20.6) involving matrices P 1# • • * , P n _ 2 to tridiagonal 
form. These matrices are orthogonal and symmetric. Thus P* 1 = P X T = P x and similarly 
for the others. These transformations produce from the given A 0 = A = [aj k ] the matrices 
A i = [«$*]. A 2 = [ajfc]. • • • > K -2 = [«*5T 2> ] in the form 

Ai = PjAqPi 
A 2 = P 2 AjP 2 


® A n _2 Pn-2^n-3^tt-2‘ 

The transformations (1) create the necessary zeros, in the first step in Row 1 and Column 
1, in the second step in Row 2 and Column 2, etc., as Fig. 447 illustrates for a 5 X 5 
matrix. B is tridiagonal. 

How do we determine P 1? P* • • • P«-2? Now, all these P r are of the form 

(2) P r = I - 2v r v r T (r = 1, • • • , n - 2) 

where I is the n X n unit matrix and v r = [Vj r ] is a unit vector with its first r components 
0; thus 


"o' 


" 0 " 


O 

* 


0 


0 

❖ 

< 

N 

II 

* 

> » v n— 2 

* 







where the asterisks denote the other components (which will be nonzero in general). 
Step 1. v x has the components 


(4) 



On = 0 

(a) 

■ Vt (' + ¥) 

(b) 

__ a jl s & n a 21 

V]1 2v 21 s 1 


where 

(c) 

■$i = Vcr 21 2 + a 31 2 + • • • + a nl z 


where Si > 0, and sgn a 2 \ — +1 if <z 21 = 0 and sgn a 21 = — 1 if a 21 < 0. With this we 
compute P a by (2) and then by (1). This was the first step. 
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EXAMPLE 1 


First Step Second Step Third Step 

Aj = Pj APj = P 2 AjP 2 A 3 = P 3 A 2 P 3 

Fig. 447. Householder's method for a 5 X 5 matrix. 
Positions left blank are zeros created by the method. 


Step 2. We compute v 2 by (4) with all subscripts increased by 1 and the cij k replaced by 
aj]?, the entries of A x just computed. Thus [see also (3)] 


V 12 — t>22 “ 0 


(4*) 


where 




Vj2 = 


( 1 ) ( 1 ) 
_ a j2 sgn a 32 


2 ^ 32^2 


j = 4, 5, • • • , n 


=vw 


4sf + «$' + • • • + atf 


ci r 


With this we compute P 2 by (2) and then A 2 by (1). 

Step 3 . We compute v 3 by (4*) with all subscripts increased by 1 and the af k replaced 
by the entries ci™ of A 2 , and so on. 

Householder Tridiagonalization 

Tridiagonalize the real symmetric matrix 


A = Ao = 


Solution . Step 1. We compute S 2 = 4 2 -h l 2 + I 2 = 18 from (4c). Since a 2 i = 4 > 0, we have 
sgn a 2 1 = + 1 in (4b) and get from (4) by straightforward computation 


vi = 


"0 ■ 


‘0 

*>21 


0.985 598 56 

”31 


0.119 573 16 

- y 41_ 


^0.1 19 573 16_ 


From this and (2), 


Pi = 


-0.942 809 04 
-0.235 702 27 
-0.235 702 27 


0 

-0.235 702 27 
0.971404 52 
—0.028 595 48 


-0.235 702 27 
-0.028 595 48 
0.971 40452 
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From the first line in ( I) we now get 



" 6 

-Vl8 

0 

0' 

- PiAoPa - 

-VT8 

7 

-1 

-1 

0 

-1 

9/2 

3/2 


0 

-1 

3/2 

9/2 


Step 2. From (4*) we compute S 2 2 — 2 and 


v 2 = 


'0 " 


"0 

0 


0 

*>32 


0.923 879 53 

_ y 42_ 


_0.382 683 43_ 


From this and (2), 


P 2 


The second line in (1) now gives 


B — A 2 — P 2 AjP 2 — 


0 

I 

0 

0 

6 

-Vl8 

0 

0 


0 

0 

-1/V2 

-lV2 

— vT8 

7 

V2 

0 


0 ‘ 
0 

-1/V2 

1/V2 

0 

V2 

6 

0 


This matrix B is tridiagonal. Since our given matrix has order n = 4. we needed n — 2 = 2 steps to accomplish 
this reduction, as claimed. (Do you see that we got more zeros than we can expect in general?) 

B is similar to A, as we now show in general. This is essential because B thus has the same spectrum as A, 
by Theorem 2 in Sec. 20.6. M 


B Similar to A. We assert that B in (1) is similar to A = A 0 . The matrix P r is symmetric; 
indeed, 

P r T = (I - 2v r v r T ) T = I T - 2(v r v/) T = I - 2v r v r T = 

Also, P r is orthogonal because v r is a unit vector, so that v r T v r = 1 and thus 

P r P r T = P r 2 = (I — 2v r v r T ) 2 = I — 4v r v r T + 4v r v r T v r v r T 

= I — 4v r v r T + 4v r (v r T v r )v r T = I. 

Hence P r 1 = P r T = P r and from (1) we now obtain 


P?i— 2 ^n— 3 P«.— 2 


^72.-2^71-3 * * " PlAPl " ' 

* Pn- 3 Pn-2 

K-zKiz ■ Pr'AP: 

f*n-3 2 

P _1 AP 



where P — P X P 2 • • • P w _ 2 . This proves our assertion. 
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QR-Factorization Method 

In 1958 H. Rutishauser of Switzerland proposed the idea of using die LU-factorization 
(Sec. 20.2; he called it LR-factorization) in solving eigenvalue problems. An improved 
version of Rutishauser’ s method (avoiding breakdown if certain submatrices become 
singular, etc.; see Ref. [E29D is the QR-method, independently proposed by the American 
J. G. F. Francis ( Computer J. 4 (1961-62), 265-271, 332-345) and the Russian 
V. N. Kublanovskaya ( Zhurncil Vych. Mat. i Mat. Fiz. 1 (1961), 555-570). The QR-method 
uses the factorization QR with orthogonal Q and upper triangular R. We discuss the 
QR-method for a real symmetric matrix. (For extensions to general matrices see Ref. [E29] 
in App. 1.) 

In this method we first transform a given real symmetric n X n matrix A into a 
tridiagonal matrix B 0 = B by Householder's method. This creates many zeros and thus 
reduces the amount of further work. Then we compute B^ B 2 , • • * stepwise according to 
the following iteration method. 

Step L Factor B 0 = Q 0 Ro with orthogonal R 0 and upper triangular R 0 . Then compute 

= RoQo 

Step 2. Factor B x = Q]Ri. Then compute B 2 = RiQi- 
General Step s + I. 


(5) 


(a) Factor B s = Q $ R S . 

(b) Compute B s+1 = R S Q S . 


Here Q s is orthogonal and R s upper triangular. The factorization (5a) will be explained 
below. 

B s+I Similar to B. Convergence to a Diagonal Matrix. From (5a) we have R s = Qj l B s . 
Substitution into (5b) gives 

(6) B s+1 = R S Q S = Q^B.Q,. 


Thus B s +i is similar to B s . Hence B s + X is similar to B 0 = B for all s. By Theorem 2, 
Sec. 20.6, this implies that B s+1 has the same eigenvalues as B. 

Also, B s . h1 is symmetric. This follows by induction. Indeed, B 0 = B is symmetric. 
Assuming B s to be symmetric, that is, B S T = B s , and using Qj 1 = Q $ T (since Q s is 
orthogonal), we get from (6) the symmetry, 

B s+ x T = (Q S T B S Q,) T = Q/B/Q, = Q/B S Q S = B s+1 . 


If the eigenvalues of B are different in absolute value, say, |A X | > |A 2 | > • • • > |A rt |, 
then 


lim B s = D 

S— *30 


where D is diagonal, with main diagonal entries A : , A 2 , • • • , A*. (Proof in Ref. [E29] 
listed in App. 1.) 
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How to Get the QR-Factorization, say, B = B 0 = [bj^\ = Q 0 Ro- The tridiagonal 
matrix B has n — 1 generally nonzero entries below tire main diagonal. These are 
Z?2i> 632, * * * * & n ,n-i- We multiply B from the left by a matrix C 2 such that C 2 B = \bff\ 
has bfl = 0. We multiply this by a matrix C 3 such that C 3 C 2 B = [bff\ has b^ — 0, etc. 
After n — 1 such multiplications we are left with an upper triangular matrix R 0 , namely, 


(7) 


C n C n _j 


C3C2B0 — Rq. 


These n X n matrices G, are very simple. Cj has the 2 X 2 submatrix 

cos Oj sin Of 

—sin Oj cos Oj 


(Oj suitable) 


in Rows j — 1 and j and Columns /* — 1 and j; everywhere else on the main diagonal the 
matrix C,- has entries 1 ; and all its other entries are 0. (This submatrix is the matrix of a 
plane rotation through the angle Of, see Team Project 28, Sec. 7.2.) For instance, if 
n = 4, writing Cj = cos 0j, Sj — sin 0j, we have 


t*2 

^2 

0 

0" 


'1 

0 

0 

0“ 


“1 

0 

0 

0“ 

~ S 2 

^2 

0 

0 

p 

II 

0 

c 3 

H 

0 

. c 4 = 

0 

1 

0 

0 

0 

0 

1 

0 


0 

-s* 

C 3 

0 


0 

0 

C4 

S 4 

. 0 

0 

0 

L 


_0 

0 

0 

1. 


.0 

0 

-s 4 

^4- 


These are orthogonal. Hence their product in (7) is orthogonal, and so is the inverse 
of this product. We call this inverse Q 0 . Then from (7), 

(8) B 0 = Q 0 Ro 
where, with C” 1 = Cf, 

(9) Qo = (C n C n _! • • • C 3 C 2 ) -1 = C 2 t C 3 t • • • C n _ 1 T Cj. 

This is our QR-factorization of B 0 . From it we have by (5b) with .s = 0 

(10) B x = RoQo = R 0 C 2 t C 3 t • • • C n _ 1 T C n T . 

We do not need Q 0 explicitly, but to get B x from (10), we first compute RoC2 T > then 
(R 0 C 2 t )C 3 t , etc. Similarly in the further steps that produce B 2 , B 3 , • • • . 

Determination of cos 0j and sin 0j. We finally show how to find the angles of rotation, 
cos 0 2 and sin 0 2 hi C 2 must be such that h™ = 0 in the product 


" c 2 

*2 

0 


’*11 

*12 

*13 


-s 2 

c 2 

0 


*21 

*22 

*23 

. . . 

. 

. 

• 







C 2 B = 
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EXAMPLE 2 


Now b^i is obtained by multiplying the second row of C 2 by the first column of B, 
bfi = ~s 2 b n + c 2 b 2 i = -(sin 0 2 )b u + (cos 6 2 )b 21 = 0. 

Hence tan d 2 = s 2 lc 2 = b 21 /b n , and 
cos 0 2 = 


1 


I 


( 11 ) 


sin 0 2 = 


vT + tan 2 0 2 Vl + (b 2 i/bu) 2 

tan $2 ^ 21^11 


Similarly for 0 3 , 0 4 , 


VT 4 - tan 2 e 2 Vi + (i>2#n ) 2 ’ 
The next example illustrates all this. 


QR-Factorization Method 

Compute all the eigenvalues of the matrix 

A = 


Solution . We first reduce A to tridiagonal form. Applying Householder’s method, we obtain (see Example 1) 

6 -Vl8 0 0l 

— Vl8 7 V2 0 | 

0 V5 6 0 

L 0 0 0 3. 


A 2 = 


From the characteristic determinant we see that A 2 , hence A, has the eigenvalue 3. (Can you see this directly 
from A 2 ?) Hence it suffices to apply the QR-method to the tridiagonal 3x3 matrix 


B n = B = 


Step i. We multiply B from the left by 


6 — VT8 

-Vl8 7 
L 0 V5 


0 

V2 

6 . 



" cos 0 2 

sin 0 2 

0“ 



"1 0 

0 ' 

c 2 = 

—sin 0 2 
. 0 

cos 0 2 
0 

0 

1. 

and then C 2 B by 

C 3 = 

0 cos 0 3 
_0 -sin 0 3 

sin 05 
cos 0 3 _ 


Here (-sin fr 2 ) ' 6 + (cos0 2 X~vT8) = 0 gives (II) cos 0 2 = 0.816 496 58 and sin 6 Z = -0.57735027. 
With these values we compute 

f 7.348 469 23 -7.505 553 50 -0.816 496 581 


C 2 B = 


3.265 986 32 
1.414213 56 


1.154700 54 

6.000 000 00 J 


In C 3 we get from (- sin 0 3 ) • 3.265 986 32 -f (cos 0 3 ) • 1.414 213 56 = 0 the values cos 0^ = 0.917 662 94 
and sin 6$ = 0.397 359 7 1 . This gives 


Ro — c 3 c 2 b — 


7.348 469 23 -7.505 553 50 -0.816 496 58 

0 3.559 026 08 3.443 784 13 

L0 0 5.047 146 15 J 
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From this we compute 


Bi - R 0 C 2 t C 3 t — 


10.333 333 33 
-2.054 804 67 
0 


2.054 804 67 
4.035 087 72 
2.005 532 51 


0 

2.005 532 51 
4.631 578 95. 


which is symmetric and tridiagonal. The off-diagonal entries in are still large in absolute value. Hence we 
have to go on. 

Step 2. We do the same computations as in the first step, with B 0 = B replaced by Bj and C 2 and C 3 changed 
accordingly, the new angles being 0 2 = —0.1 96 291 533 and 0 3 = 0.513 415 589. We obtain 


and from this 


R i 


'10.535 653 75 
0 

_ 0 


-2.802 322 41 
4.083 295 84 
0 


-0.391 145 88' 
3.988 240 28 
3.068 326 68 _ 


b 2 = 


' 10.879 879 88 
-0.796 379 18 
. 0 


-0.796 379 18 
5.447 386 64 
1.507 025 00 


0 

1.507 025 00 . 
2.672 733 48. 


We see that the off-diagonal entries are somewhat smaller in absolute value than those of B x . but still much too 
large for the diagonal entries to be good approximations of the eigenvalues of B. 

Further Steps. We list the main diagonal entries and the absolutely largest off-diagonal entry, which is 
\b\ 2 \ = \^2i \ ,n a *l steps. You may show that the given matrix A has the spectrum 1 1, 6, 3, 2. 


Step j 



^33 

max^ fc |i$*| 

3 

10.966 892 9 

5.945 898 56 

2.087 208 51 

0.585 235 82 

5 

10.997 087 2 

6.001 815 41 

2.001 097 38 

0.120 653 34 

7 

10.999 742 1 

6.000 244 39 

2.000 013 55 

0.035 91 1 07 

9 

10.999 977 2 

6.000 022 67 

2.000 000 17 

0.010 684 77 


Looking back at our discussion, we recognize that the purpose of applying Householder’s 
tridiagonalization before the QR- factorization method is a substantial reduction of cost in 
each QR-factorization, in particular if A is large. 

Convergence acceleration and thus further reduction of cost can be achieved by a 
spectral shift, that is, by taking B s — kj. instead of B $ with a suitable k s . Possible choices 
of k s are discussed in Ref. [E29], p. 510. 


» m:omrM -SE3:r=2n=9 - — : 


[N4] householder tridiagonalization 

Tridiagonalize, showing the details: 



"3.5 

1.0 

1.5" 


1. 

1.0 

5.0 

3.0 



_ 1.5 

3.0 

3.5. 



["0.98 

0.04 

0.44' 

3. 

0.04 

0.56 

0.40 


.0.44 

0.40 

0.80. 


1 

1 

0 


f8 8 2 21 


4. 


8 

2 


8 

2 


L2 2 


2 2 

6 4 

4 6_ 


5-9 


QR-FACTORIZATION 


Do three QR-steps to find approximations of the 
eigenvalues of: 


5. The matrix in the answer to Prob. 1 
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6. The matrix in the answer to Prob. 3 


f7.0 0.1 01 


f 18 2 0“1 


9. 


0.1 


4.0 0.1 


7. 


2 8 2 


0 0.1 1 . 0 . 



Lo 

2 

2j 


‘ 16.2 

-0.1 

0 

8. 

-0.1 

-4.3 

0.2 


_ 0 

0.2 

4.1. 


10. CAS EXPERIMENT. QR-Method. Try to find out 
experimentally on what properties of a matrix the speed 
of decrease of off-diagonal entries in the QR-method 
depends. For this purpose write a program that first 
tridiagonalizes and then does QR-steps. Try the 
program out on the matrices in Probs. 1, 3, and 4. 
Summarize your findings in a short report. 


CHAPTER 20 REVIEW QUESTIONS AND PROBLEMS 


1. What are the main problem areas in numeric linear 
algebra? 

2. What is pivoting? When and how would you apply it? 

3. What happens if you apply Gauss elimination to a 
system that has no solutions? 

4. What is Doolittle’s method? Its connection to Gauss 
elimination? 

5. What is Cholesky’s method? When would you apply it? 

6. What do you know about the convergence of the 
Gauss-Seidel method? 

7. What is ill-conditioning? What is the condition number 
and its significance? 

8. What is least squares approximation? What are the 
normal equations? 


17. 


* 2 - 

*- A*3 = 

5 



*1 4* 2*2 - 

t- 2a- 3 = 

6 



X\ 4- 2*2 4- 3*3 = 

8 


18. 

5*! 4* 

*2 

- 3*3 

= 

17 


- 

5*2 

4* 15*3 

= 

-10 


2*i ” 

3*2 

4* 9*3 

= 

0 

19. 

2a-j 


4* 3*3 = 


15 



4*2 

- *3 = 

= - 

-13 


3a-i - 

*2 

4* 5*3 = 


26 


9. What is an eigenvalue of a matrix? Why are eigenvalue 
problems important? Give typical examples. 

10. Why are similarity transformations of matrices important 
in designing numeric methods? Give examples. 

11. What is the power method for eigenvalues? What are 
its advantages and disadvantages? 


20. Solve Prob. 17 by Doolittle’s method. 

21. Solve Prob. 17 by Cholesky’s method. 


22-24 INVERSE MATRIX 


Compute the inverse of: 


12. State Gerschgorin’s theorem from memory. Can you 


“1.0 

2.0 

0.5“ 

remember its proof? 

22. 

0.5 

1.0 

0.5 

13. State Schur’s inequality and give some applications of 





it. 

14. What is tridiagonalization? When would you apply it? 


-1.5 

2.0 

1.0. 

15. What is the idea of the QR-method? When would you 


“1.5 

2.0 

1.0“ 

apply the method? 

23. 

2.0 

3.5 

1.5 

16-19 GAUSS ELIMINATION 


.1.0 

1.5 

9.0. 

Solve: 

16. 4*2 — 3*3 = 1 1.8 


r 5 

i r 


5*! 4- 3*2 4- * 3 = 34.2 

24. 

1 

6 0 


6 *! — 7*2 4- 2*3 = —3.1 


J 

0 8. 
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25-26 1 GAUSS-SEIDEL METHOD 

Do 3 steps without scaling, starting from [1 l 1] T : 


25. X\ + 15*2 “ *3 = 1 1 

10*! 4- 3*2 = — 17 

2* x — *2 4 5*3 = 5 

26. 4*j - a - 2 = 5.5 

4*2 ~ *3 = 0.4 

~ x i 4 4*3 = 11.2 


27-32 1 VECTOR NORMS 

Compute the C r , C 2 and f^-norms of the vectors 

27. [0 4 -8 3] t 

28. [3 8 -Ilf 

29. [-4 1 0 2] t 

30. [0 0 1 Of 

31. [-5 -2 7 0 Of 

32. [0.3 1.4 0.2 — 0.6f 

33-35[ MATRIX NORM 

Compute the matrix norm corresponding to the vector 
norm for the coefficient matrix: 

33. In Prob. 17 


34. In Prob. 18 

35. in Prob. 19 

1 36-38 1 CONDITION NUMBER 

Compute the condition number (corresponding to the 
vector norm) of the coefficient matrix: 

36. In Prob. 22 

37. In Prob. 23 

38. In Prob. 24 

39-40 1 FITTING BY LEAST SQUARES 

Fit: 

39. A straight line to (-2, 0.1), (0, 1.9), (2, 3.8), (4. 6.1), 
(6, 7.8) 

40. A quadratic parabola to (1, 9), (2, 5), (3, 4), (4, 5), (5, 7) 

41-43 1 EIGENVALUES 

Find three circular disks that must contain all the eigenvalues 

of the matrix: 

41. In Prob. 22 

42. Tn Prob. 23 

43. In Prob. 24 

44. (Power method) Do 4 steps of the power method for 

the matrix in Prob. 24, starting from [1 1 if and 

computing the Rayleigh quotients and error bounds. 

45. (Householder and QR) Tridiagonalize the matrix in 
Prob. 23. Then apply 3 QR steps. (Spectrum (6S): 
9.65971,4.07684,0.263451) 


SUMMARY -OF CHAPTER 20 

Numeric Linear Algebra 


Main tasks are the numeric solution of linear systems (Secs. 20. 1-20.4), curve fitting 
(Sec. 20.5), and eigenvalue problems (Secs. 20.6-20.9). 

Linear systems Ax = b with A = [a jk ], written out 

E x : a n x x + * • * + a ln x n = b x 

^ 21 x 1 “I" ' * ’ "h #2 r?A*n = ^2 

( 1 ) 


E w . a ni x i t • • • 4- ci nn x n b n 

can be solved by a direct method (one in which the number of numeric operations 
can be specified in advance, e.g., Gauss’s elimination) or by an indirect or iterative 
method (in which an initial approximation is improved stepwise). 
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The Gauss elimination (Sec. 20. 1 ) is direct, namely, a systematic elimination 
process that reduces (1) stepwise to triangular form. In Step 1 we eliminate*! from 
equations E 2 to E n by subtracting (a 2 i/a n ) E x from E 2 , then (a 31 /a u ) E x from E 3 , 
etc. Equation E 2 is called the pivot equation in this step and a u the pivot. In Step 
2 we take the new second equation as pivot equation and eliminate * 2) etc. If the 
triangular form is reached, we get *„ from the last equation, then jc n _ x from the 
second last, etc. Partial pivoting (= interchange of equations) is necessary if 
candidates for pivots are zero, and advisable if they are small in absolute value. 

Doolittle’s, Crout’s, and Cholesky’s methods in Sec. 20.2 are variants of the 
Gauss elimination. They factor A = LU (L lower triangular, U upper triangular) 
and solve Ax = LUx = b by solving Ly = b for y and then Ux = y for x. 

In the Gauss-Seidel iteration (Sec. 20.3) we make a n = r / 22 = • • • = a nn = 1 
(by division) and write Ax = (I + L 4- U)x = b; thus x = b — (L + U)x, which 
suggests the iteration formula 

(2) x (m+1) = b - Lx (w+1) - Ux (m) 


in which we always take the most recent approximate x/s on the right. If ||C|| < 1, 
where C = -(I + L) _1 U, then this process converges. Here, ||C|| denotes any 
matrix norm (Sec. 20.3). 

If the condition number k(A) = ||A|| ||A _1 || of A is large, then the system 
Ax = b is ill-conditioned (Sec. 20.4), and a small residual r = b - Ax does not 
imply that xis close to the exact solution. 

The fitting of a polynomial p(x) = b 0 4- b x x + • • • + b m x m through given data 
(points in the Ay-plane) (x x , )\), ■ ■ • , (x n , y n ) by the method of least squares is 
discussed in Sec. 20.5 (and in statistics in Sec. 25.9). 

Eigenvalues A (values A for which Ax = Ax has a solution x =£ 0, called an 
eigenvector) can be characterized by inequalities (Sec. 20.7), e.g. in Gerschgorin’s 
theorem, which gives n circular disks which contain the whole spectrum (all 
eigenvalues) of A, of centers ajj and radii (sum over k from 1 to /?, k =£ j). 

Approximations of eigenvalues can be obtained by iteration, stalling from an 
x 0 0 and computing x x = Ax 0 , x 2 = Ax x , • • • , x n = Ax n _ x . In this power 
method (Sec. 20.8) the Rayleigh quotient 


(3) 


(Ax) t x 

ex-*,,) 


gives an approximation of an eigenvalue (usually that of the greatest absolute value) 
and, if A is symmetric, an error bound is 


(4) 


(Ax) T Ax 


x T x 


Convergence may be slow but can be improved by a spectral shift. 

For determining all the eigenvalues of a symmetric matrix A it is best to first 
tridiagonalize A and then to apply the QR-method (Sec. 20.9), which is based on a 
factorization A = QR with orthogonal Q and upper triangular R and uses similarity 
transformations. 






CHAPTER 2 1 

Numerics for ODEs and PDEs 


Numeric methods for differential equations are of great practical importance to the 
engineer and physicist because practical problems often lead to differential equations that 
cannot be solved by one of the methods in Chaps. 1-6 or 12 or by similar methods. Also, 
sometimes an ODE does have a solution formula (as the ODEs in Secs. 1 .3-1.5 do), which, 
however, in some specific cases may become so complicated that one prefers to apply a 
numeric method instead. 

This chapter explains and applies basic methods for the numeric solution of ODEs (Secs. 
21.1-21.3) and PDEs (Secs. 21.4-21.7). 

Sections 21.1 and 21.2 may be studied immediately after Chap. 1 and Sec. 21.3 
immediately after Chap. 2, because these sections are independent of Chaps. 19 and 20. 

Sections 21.4-21.7 on PDEs may be studied immediately after Chap. 12 if students 
have some knowledge of linear systems of algebraic equations. 

Prerequisite: Secs. 1. 1-1.5 for ODEs, Secs. 12.1-12.3, 12.5, 12.10 for PDEs. 

References and Answers to Problems App. 1 Part E (see also Parts A and C), App. 2. 


21.1 Methods for First-Order ODEs 

From Chap. 1 we know that an ODE of the first order is of the form F( a\ >\ /) = 0 and 
can often be written in the explicit form / = f(x , y). An initial value problem for this 
equation is of the form 

(1) / = f(x, y), y(x 0 ) = v 0 

where a 0 and y 0 are given and we assume that the problem has a unique solution on some 
open interval a < x < b containing x 0 . 

In this section we shall discuss methods of computing approximate numeric values of 
the solution y(.v) of (1) at the equidistant points on the x-axis 

x i — *o + /?, x 2 = A'o + 2/7, A' 3 = x 0 + 3/7. 

where the step size h is a fixed number, for instance, 0.2 or 0.1 or 0.01, whose choice we 
discuss later in this section. Those methods are step-by-step methods, using the same 
formula in each step. Such formulas are suggested by the Taylor series 

h 2 

(2) y( x + h) = y(A) + A/C a) + — /'(A) + • • • . 
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For a small h the higher powers /? 2 , A 3 , * • • are very small. This suggests the crude 
approximation 

v( x 4* A) « y(x) + hy'( x) 

= v(a) 4 hf(x, y) 

(with the second line obtained from the given ODE) and the following iteration process. 
In the first step we compute 

= .Vo + hfU o» Vo) 


which approximates y(x x ) = y(x 0 4 A). In the second step we compute 

y 2 = yi + hf(x i, y x ) 

which approximates ,y(x 2 ) = y(x Q 4 2 A), etc., and in general 

(3) Vn+l yn Vn ) ~ 0, 1, * * *)• 

This is called the Euler method or the Euler-Cauchy method. Geometrically it is an 
approximation of the curve of y(A*) by a polygon whose first side is tangent to this curve 
at x 0 (see Fig. 448). 



This crude method is hardly ever used in practice, but since it is simple, it nicely explains 
the principle of methods based on the Taylor series. 

Taylor’s formula with remainder has the form 


y(x + h) = yW + liy'(x) + §/?y'(£) 


(where x ^ x 4 h). It shows that in the Euler method the truncation error in each 
step or local truncation error is proportional to h 2 , written 0(h 2 ), where O suggests order 
(see also Sec. 20. 1 ). Now over a fixed A-interval in which we want to solve an ODE the 
number of steps is proportional to l//z. Hence the total error or global error is proportional 
to h 2 (\/h) = A 1 . For this reason, the Euler method is called a first-order method. In 
addition, there are roundoff errors in this and other methods, which may affect the 
accuracy of the values y x , y 2 , * • • more and more as n increases, as we shall see. 
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EXAMPLE 1 


EXAMPLE 2 


Table 21.1 Euler Method Applied to (4) in Example 1 and Error 


n 

*«, 

Vrt 

0.2(x„. + y n ) 

Exact 

Values 

Error ^ 

0 

0.0 

0.000 

0.000 

0.000 

0.000 

1 

0.2 

0.000 

0.040 

0.021 

0.021 

2 

0.4 

0.040 

0.088 

0.092 

0.052 

3 

0.6 

0.128 

0.146 

0.222 

0.094 

4 

0.8 

0.274 

0.215 

0.426 

0.152 

5 

1.0 

0.489 


0.718 

0.229 


Euler Method 

Apply the Euler method to the following initial value problem, choosing h = 0.2 and computing , y 5 : 

(4) y' = * + >\ y(0) = 0. 

Solution . Here f(x, y) = x 4 y; hence f(x n , y n ) = x n 4 y n , and we see that (3) becomes 


Jn+ 1 = ) ? n 4* 0.2 (a^ 4 y w ). 


Table 21.1 shows the computations, the values of the exact solution 

y{x) = e x - x - 1 

obtained from (4) in Sec, 1.5, and the error. Tn practice the exact solution is unknown, but an indication of the 
accuracy of the values can be obtained by applying the Euler method once more with step 2h = 0.4, letting y w * 
denote the approximation now obtained, and comparing corresponding approximations. This computation is: 



>’n* 

0.4(x„ + y n ) 

3 V in Table 21.1 

Difference y n — y n * 

0.0 

0.000 

0.000 

0.000 

0.000 

0.4 

0.000 

0.160 

0.040 

0.040 

0.8 

0.160 


0.274 

0.114 


Let €f t and e n * be the errors of the computations with h and 2 h, respectively. Since the error is of order ft 2 , 
in a switch from h to 2h it is multiplied by 2 2 = 4, but since we need only half as many steps as before, it 
will be multiplied only by 4/2 = 2. Hence e n * « 2^ so that the difference is e^* - *= 2e n - — e n . 

Now y = y n 4 € n = y n * 4- e^* by the definition of error; hence — e* = y n - y n * indicates € n 
qualitatively. In our computations. y z - yz* — 0.04 — 0 = 0.04 (actual error 0.052, see Table 21,1) and 
>’4 “ >’4* = 0.274 - 0.160 = 0.114 (actually 0,152), ■ 

Euler Method for a Nonlinear ODE 

Figure 449 concerns the initial value problem 

(5) y f = (y — 0.0 l.v 2 ) 2 sin (a 2 ) 4 0.02 a*, y(0) = 0.4 

and shows the curve of the solution y = l/[2.5 — S(.v)J 4- 0.01a 2 where $(.v) is the Fresnel integral (38) in 
App. 3.1. It also shows 80 approximate values for 0 ^ x ^ 4 obtained by the Euler method from (3), 

■Vn+1 = y»i + 0.05 [(v„ - 0.0 1.0 2 sin (a„ 2 ) + 0.02A n ]. 

Although ft = 0.05 is smaller than ft in Example I, the accuracy is still not good. It is interesting that the error 
is not monotone increasing, obviously since the solution is not monotone. We shall return to this ODE in the 
problem set. g 
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y 

0.70 
0.60 
0.50 
0.40 

01 1 2 3 4 * 

Fig. 449. Solution curve and Euler approximation in Example 2 

Automatic Variable Step Size Selection in Modern Numeric Software 

The idea of adaptive integration as motivated and explained in Sec. 19.5 applies equally 
well to the numeric solution of ODEs. It now concerns automatically changing the step 
size h depending on the variability of y = f determined by 

( 6 *) /' = /'=/* + fy)' = fx + fyf* 

Accordingly, modern software automatically selects variable step sizes h n so that the 
error of the solution will not exceed a given maximum size TOL (suggesting tolerance). 
Now for the Euler method, when the step size is h = h n , the local error at x n is about 
\h r ? l/V n )l* We require that this be equal to a given tolerance TOL, 

(6) (a) ICI/WI = TOL, thus (b) h n = 

y"(x) must not be zero on the interval J: x 0 Sa = x n on which the solution is wanted. 
Let K be the minimum of | y"(.r)| on J and assume that K > 0. Minimum |y"(jc)j corresponds 
to maximum h = H = V2 TOL/A" by (6). Thus, V2 TOL = FT/k. We can insert this 
into (6b), obtaining by straightforward algebra 

(7) h n = (p{x n )H where <p(x n ) = 

For other methods, automatic step size selection is based on the same principle. 

Improved Euler Method 

By taking more terms in (2) into account we obtain numeric methods of higher order and 
precision. But there is a practical problem. If we substitute y' = f(x, )’(*)) into (2), we 
have 





(2*) 


y{x + h) = y(x) + Itf + \h z f + i h 3 f" + • • • . 
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Now y in f depends on a, so that we have f as shown in (6*) and f m even much 
more cumbersome. The general strategy now is to avoid the computation of these 
derivatives and to replace it by computing / for one or several suitably chosen auxiliary 
values of (a*, v). “Suitably” means that these values are chosen to make the order of 
the method as high as possible (to have high accuracy). Let us discuss two such methods 
that are of practical importance, namely, the improved Euler method and the (classical) 
Runge-Kutta method. 

In the improved Euler method or improved Euler-Cauchy method (sometimes also 
called Heun method), in each step we compute first the auxiliary value 

(8a) v*+i = y n + hf(x n , y n ) 

and then the new value 

(8b) .v«+i = y n + k h [/(•*»> yJ + /(*»+i. yS+i)]- 


This method has a simple geometric interpretation. In fact, we may say that in the 
interval from x n to x n 4- \h we approximate the solution y by the straight line through 
(a* 71 , v n ) with slope /( x n , y n ), and then we continue along the straight line with slope 
/(AVi+i, y$+ 1) until a* reaches x n + v 

The improved Euler-Cauchy method is a predictor-corrector method, because in each 
step we first predict a value by (8a) and then correct it by (8b). 

In algorithmic form, using the notations k 1 = hf(x n , y n ) in (8a) and k 2 = hf( x n+1 , y%+i) 
in (8b), we can write this method as shown in Table 21.2. 


Table 21.2 Improved Euler Method (Heun’s Method) 

ALGORITHM EULER (/, a 0 , y 0 > K N) 

This algorithm computes the solution of the initial value problem y' = f(x\ y), v(a* 0 ) = y 0 
at equidistant points x x = .v 0 4 /i, x 2 = x 0 4- 2/z, • • • , x N = a 0 4 Nil ; here / is such 
that this problem has a unique solution on the interval [a' 0? x n ] (see Sec. 1.7). 

INPUT: Initial values a 0 , .Voi ste P hi number of steps N 

OUTPUT: Approximation y n+l to the solution y(* w+1 ) at A* n+1 = a 0 4 (/i 4 1 )/?, 
where n = 0, • • • , N — 1 

For n = 0, 1, • • • , N — 1 do: 

A’n + l — A' n 4 h 
hi = hf(x n , y n ) 
h = hf(x n+ 1 , y n 4 k x ) 
v »+ 1 = Vn + |(^1 + k 2 ) 

OUTPUT A n+ 1 ,v „ +1 

End 

Stop 

End EULER 
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EXAMPLE 3 


PROOF 


Improved Euler Method 

Apply the improved Euler method to the initial value problem (4). choosing h = 0.2, as before. 
Solution . For the present problem we have in Table 21.2 


*1 = 0-2(-V„ + >' n ) 

*2 = 0.2(jc n + 0.2 + y n + 0.2 + y n )) 

.Vt»+i = y n + (2.2a m + 2.2v n + 0.2) = y* -1- 0.22 (*„ + y n ) + 0.02. 

Table 21.3 shows that our present results are more accurate than those in Example 1; see also Table 21.6. I 


Table 21.3 Improved Euler Method Applied to (4) and Error 


n 

x n 

3 } n 

0.22(x„ + y n ) 
+ 0.02 

Exact Values 
(4D) 

Error 

0 

0.0 

0.0000 

0.0200 

0.0000 

0.0000 

1 

0.2 

0.0200 

0.0684 

0.0214 

0.0014 

2 


0.0884 

0.1274 



3 


0.2158 

0.1995 



4 


0.4153 

0.2874 

0.4255 

0.0102 

5 


0.7027 


0.7183 

0.0156 


Error of the Improved Euler Method. The local error is of order h 3 and the global 
error of order /? 2 . so that the method is a second-order method. 


Setting f n = f(x n , y(x n )) and using (2*), we have 

(9a) v(-v n + h) - y(x n ) = hf n + \h z f' n + + • • • . 

Approximating the expression in the brackets in (8b) by f n + f n+1 and again using the 
Taylor expansion, we obtain from (8b) 

Ju+l “ ) f n 5=55 2^ [fn + /n+l] 

(9b) = \h [/„ + C fn + hfn + hh Z f'n + ' ')] 

= hf n + \h 2 f' n + 4/r 3 /" + • • • 

(where ' = dfdx n , etc.). Subtraction of (9b) from (9a) gives the local error 





Since the number of steps over a fixed .v-interval is proportional to I /A, the global error 
is of order h 3 /h = /? 2 , so that the method is of second order. ■ 
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Runge-Kutta Methods (Rl< Methods) 

A method of great practical importance and much greater accuracy than that of the 
improved Euler method is the classical Runge-Kutta method of fourth order f which we 
call briefly the Runge-Kutta method * 1 It is shown in Table 21.4. We see that in each 
step we first compute four auxiliary quantities k 1} k 2 , * 3 , k 4 and then the new value y n+1 . 
The method is well suited to the computer because it needs no special starting procedure, 
makes light demand on storage, and repeatedly uses the same straightforward 
computational procedure. It is numerically stable. 

Note that if f depends only on x, this method reduces to Simpson’s rule of integration 
(Sec. 19.5). Note further that k l9 • • • , k 4 depend on n and generally change from step to 
step. 


Table 21.4 Classical Runge-Kutta Method of Fourth Order 

ALGORITHM RUNGE-KUTTA (/, a* 0 , y 0 , h, N). 

This algorithm computes the solution of the initial value problem y — f(x, y) 9 y(x 0 ) = y 0 
at equidistant points 

A'i = x 0 +• ** x 2 ~ x o + 21u • * * , x N = x Q + Nh; 

here f is such that this problem has a unique solution on the interval [jc 0> x N ] (see Sec. 1.7). 

INPUT: Function /, initial values a 0 , y 0 > ste P si ze K number of steps N 

OUTPUT: Approximation y n+1 to the solution y(jc n+1 ) at A' n+ i = a* 0 + (n + 1 )h, 
where n = 0, 1, • • • , N — 1 

For n = 0, 1, • • ■ , N — 1 do: 
ki = hf(x n > y n ) 
k 2 = hf \x n + \K y n + \k x ) 
k 3 = hf(x n + |/t, y n + \k 2 ) 
k 4 = hf(x n + h y y n + * 3 ) 
x n+ 1 = x n + h 

y n +l = >n + g(*l + 2* 2 + 2*3 + * 4 ) 

OUTPUT A n+1 , y n+ i 

End 

Stop 

End RUNGE-KUTTA 


1 Named after the German mathematicians KARL RUNGE (Sec. 19.4) and WILHELM KUTTA (1867-1944). 
Runge [Math. Annalen 46 (1895), 167-178], KARL HEUN [Zeitschr. Math. Phys. 45 (1900), 23-38], and 
Kutta [Zeitschr. Math. Phys. 46 1901), 435—453] developed various such methods. Theoretically, there are 
Infinitely many fourth-order methods using four function values per step. The method in Table 21.4 is most 
popular from a practical viewpoint because of its “symmetrical” form and its simple coefficients. It was given 
by Kutta. 
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EXAMPLE 4 Classical Runge-Kutta Method 

Apply the Runge-Kutta method to the initial value problem (4) in Example l, choosing h = 0.2, as before, and 
computing five steps. 

Solution . For the present problem we have /(.v, y) = x 4 y. Hence 

*1 = 0.2(a’ h + y n ), k 2 = 0.2(a w 4 0.1 4 y n 4 0.5^), 

*3 = 0.2(A’ n 4 0.1 4 V n + 0.5*2 ). ^4 = 0*2(A‘ n 4 0.2 4 y n 4 * 3 ). 

Table 21.5 shows the results and their errors, which are smaller by factors 10 3 and 10 4 than those for the two 
Euler methods. See also Table 21.6. We mention in passing that since the present k x . • • • , k 4 are simple, 
operations were saved by substituting ki into k 2 , then k 2 into * 3 . etc.; the resulting formula is shown in Column 
4 of Table 21.5. ■ 


Table 21.5 Runge-Kutta Method Applied to (4) 


n 

x n 

yn 

0.2214(x n + y n ) 
+ 0.0214 

Exact Values (6D) 
y = e x - x - 1 

10 6 X Error 

ofy« 

0 

0.0 

0 

0.021 400 

0.000 000 

0 

1 

0.2 

0.021 400 

0.070 418 

0.021 403 

3 

2 

0.4 

0.091 818 

0.130 289 

0.091 825 

7 

3 

0.6 

0.222 107 

0.203 414 

0.222 119 

12 

4 

0.8 

0.425 521 

0.292 730 

0.425 541 

20 

5 

1.0 

0.718 251 


0.718 282 

31 


Table 21.6 Comparison of the Accuracy of the Three Methods Under Consideration 
in the Case of the Initial Value Problem (4), with h = 0.2 





Error 


A' 

II 

C* 

1 

1 

Euler 

(Table 21.1) 

Improved Euler 
(Table 21.3) 

Runge-Kutta 
(Table 21.5) 

0.2 

0.021 403 

0.021 

0.0014 

0.000 003 

0.4 

0.091 825 

0.052 

0.0034 

0.000007 

0.6 

0.222 119 

0.094 

0.0063 

0.000011 

0.8 

0.425 541 

0.152 

0.0102 

0.000 020 

1.0 

0.718 282 

0.229 

0.0156 

0.000 031 


Error and Step Size Control. RKF 
(Runge-Kutta-Fehlberg) 

The idea of adaptive integration (Sec. 19.5) has analogs for Runge-Kutta (and other) 
methods. In Table 21.4 for RK (Runge-Kutta), if we compute in each step approximations 
.v and y with step sizes h and 2 h, respectively, the latter has error per step equal to 
2 5 = 32 times that of the former; however, since we have only half as many steps for 2 /j, 
the actual factor is 2 5 /2 = 16, so that, say, 

e ah) 16/ k> and thus y <h> — y (2h> = e i2h) - e (h) = (16 — l)^ ,l> . 
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Hence the error e = e ih) for step size h is about 

( 10 ) e - j^( v - v) 

where y — y = y (h> — y (2h \ as said before. Table 21.7 illustrates (10) for the initial value 
problem 

(11) / = (y- x - l) 2 + 2, ,y(0) = 1. 

the step size h = 0.1 and 0 ^ ^ 0.4. We see that the estimate is close to the actual 

error. This method of error estimation is simple but may be unstable. 


Table 21.7 Runge-Kutta Method Applied to the Initial Value Problem (11) 
and Error Estimate (10). Exact Solution y = tan x + x + 1 


x y 

(Step size h) 

y Error Actual 

(Step size 2 / 2 ) Estimate (10) Error 

Exact 

Solution (9D) 

0.0 1.000000 000 
0.1 1.200 334 589 

0.2 1.402 709 878 

0.3 1.609 336 039 

0.4 1.822 792 993 

1 .000 000 000 0.000 000 000 0.000 000 000 

0.000 000 083 

1 .402 707 408 0.000 000 1 65 0.000 000 157 

0.000 000 210 

1.822 788 993 0.000 000 267 0.000 000 226 

1.000000 000 
1.200 334 672 
1.402 710 036 
1.609 336 250 
1.822 793 219 


RKF. E. Fehlberg [Computing 6 (1970), 61—71] proposed and developed error control 
by using two RK methods of different orders to go from (* n , y n ) to (* n+1 , y n+ i). The 
difference of the computed y-values at .v n +i gives an error estimate to be used for step 
size control. Fehlberg discovered two RK formulas that together need only 6 function 
evaluations per step. We present these formulas here because RKF has become quite 
popular. For instance. Maple uses it (also for systems of ODEs). 

Fehlberg’s fifth-order RK method is 

(12a) )’n+i = y n + y + y 6 *6 

with coefficient vector y = [yi • • • y 6 ], 

n2hl v = foe. 0 , 6656 2 8561 _9_ _2l 

\±£D) y [ 135 U 12825 56430 50 55j* 

His fourth-order RK method is 

03a) y* n+1 = y n + + • • • + y\k s 

with coefficient vector 


2197 _n 

4104 5J* 


(13b) 


/ = lift o 


1408 

2565 
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EXAMPLE 5 


In both formulas we use only 6 different function evaluations altogether, namely, 


ki = hf(x n > }’J 

*2 = Ilf tin + \h- 
k -3 = Ilf tin + i*’ 
k 4 = Ilf tin + it*. 

k 5 ~ Ilf tin + h' 

k 6 = hfti'n + \hs 


)’n + 4 * 1 ) 

,V» + ^*1 + &k 2 ) 

}’n 2197*1 — ll 97*2 2197 * 3 ) 

y’n + 2 lf*l — 8^2 + '^ 3 1*3 *3 4104 ^ 4 ) 

3 n ~ 27^1 + 2/: 2 — §§ 65^3 + 4104^4 


40 * 5 )- 


The difference of (12) and (13) gives the error estimate 

( 15 ) €n+l l * 3 , n+l ~ .Vn+1 = 360^1 ” 4275^3 “ 75240^4 ^ 50^5 + ^^6- 


Runge-Kutta-Fehlberg 

For the initial value problem (1 1) we obtain from ( 1 2) — ( 14) with h = 0.1 in the First step the l2S-values 

ki = 0.200000 000000 k 2 = 0.200062 500000 

* 3 = 0.200 140 756867 k 4 = 0.200856 926 1 54 

k 5 = 0.201006 676700 k 6 = 0.200250418651 

vr = 1.200334 66949 

v A = 1.200334 67253 

and the error estimate 

e A - >’i - yt = 0.000000 00304. 

The exact 1 2S-value is y(Q. 1 ) = 1 .200334 67209. Hence the actual error of v A is —4.4 • 1 0“ 10 , smaller than that 
in Table 21.7 by a factor 200. ■ 

Table 21.8 summarizes essential features of the methods in this section. It can be shown 
that these methods are numerically stable (definition in Sec. 19.1). They are one-step 
methods because in each step we use the data of just one preceding step, in contrast to 
multistep methods where in each step we use data from several preceding steps, as we 
shall see in the next section. 


Table 21.8 Methods Considered and Their Order (= Their Global Error) 


Method 

Function Evaluation 
per Step 

Global Error 

Local Error 

Euler 

1 

0(h) 

0(h 2 ) 

Improved Euler 

2 

0(h 2 ) 

0(h 3 ) 

RK (fourth order) 

4 

0(h 4 ) 

0(h 5 ) 

RKF 

6 

0(h 5 ) 

0(h 6 ) 
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EXAMPLE 


Backward Euler Method. Stiff ODEs 

The backward Euler formula for numerically solving (1) is 

(16) Xn+i yn fyfC*7i+l> Jn+l) — 


This formula is obtained by evaluating the right side at the new location (x n+1 , .y n +i)» this 
is called die backward Euler scheme. For known y n it gives y n+1 implicitly , so it defines 
an implicit method, in contrast to the Euler method (3), which gives y n+1 explicitly. 
Hence (16) must be solved for y n+v How difficult this is depends on / in (1). For a linear 
ODE this provides no problem, as Example 6 (below) illustrates. The method is particularly 
useful for “stiff’ ODEs, as they occur quite frequently in the study of vibrations, electric 
circuits, chemical reactions, etc. The situation of stiffness is roughly as follows; for details, 
see, for example, [E5], [E25], [E26] in App. 1. 

Error terms of the methods considered so far involve a higher derivative. And we ask 
what happens if we let h increase. Now if the error (the derivative) grows fast but the 
desired solution also grows fast, nothing will happen. However, if that solution does not 
grow fast, then with growing h the error term can take over to an extent that the numeric 
result becomes completely nonsensical, as in Fig. 450. Such an ODE for which It must 
thus be restricted to small values, and the physical system the ODE models, are called 
stiff. This term is suggested by a mass-spring system with a stiff spring (spring with a 
large k; see Sec. 2.4). Example 6 illustrates that implicit methods remove the difficulty 
of increasing It in the case of stiffness: it can be shown that in the application of an implicit 
method the solution remains stable under any increase of /i, although the accuracy 
decreases with increasing /?. 

Backward Euler Method. Stiff ODE 

The initial value problem 

y' = /(.v. v) = -20v + 20.v 2 + 2x. v(0) = 1 

has the solution (verify!) 

— 20x i 2 
v = e 4- a. 

The backward Euler formula ( 1 6) is 


)'«+l = >’« + l‘f(x n +\- >'n+i) = >’» + /i(-20y n+1 + 20 A- 2 . u + 2.v„ +1 ). 

Noting thatx n+1 = + h , taking the term -20 /;y n+1 to the left, and dividing, we obtain 

y n + M20(.v„ + /Q a + 2(.v„ + /i)] 

06*) J'n+l = 1 + 20/i 

The numeric results in Table 21.9 show the following. 

Stability of the backward Euler method for h = 0.05 and also for h = 0.2 witli an error increase by about a 
factor 4 for h = 0.2. 

Stability of the Euler method for h = 0.05 but instability for h = 0.1 (Fig. 450), 

Stability of RK for/; = 0.1 but instability for/; = 0.2. 

Tliis illustrates that the ODE is stiff. Note that even in the case of stability the approximation of the solution 
near x = 0 is poor. ■ 


Stiffness will be considered further in Sec. 21.3 in connection with systems of ODEs. 
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Fig. 450. Euler method with h = 0.1 for the stiff 
ODE in Example 6 and exact solution 


Table 21.9 Backward Euler Method (BEM) for Example 6. Comparison with Euler and RK 


X 

BEM 
/» = 0.05 

BEM 
h = 0.2 

Euler 
h = 0.05 

Euler 
h = 0.1 

RK 

h = 0.1 

RK 

h = 0.2 

Exact 

0.0 

1.00000 

1.00000 

1.00000 

1.00000 

1.00000 

1.000 

1.00000 

0.1 

0.26188 


0.00750 

- 1.00000 

0.34500 


0.14534 

0.2 

0.10484 

0.24800 

0.03750 

1.04000 

0.15333 

5.093 

0.05832 

0.3 

0.10809 


0.08750 

-0.92000 

0.12944 


0.09248 

0.4 

0.16640 

0.20960 

0.15750 

1.16000 

0.17482 

25.48 

0.16034 

0.5 

0.25347 


0.24750 

-0.76000 

0.25660 


0.25004 

0.6 

0.36274 

0.37792 

0.35750 

1.36000 

0.36387 

127.0 

0.36001 

0.7 

0.49256 


0.48750 

-0.52000 

0.49296 


0.49001 

0.8 

0.64252 

0.65158 

0.63750 

1.64000 

0.64265 

634.0 

0.64000 

0.9 

0.81250 


0.80750 

-0.20000 

0.81255 


0.81000 

1.0 

1.00250 

1.01032 

0.99750 

2.00000 

1.00252 

3168 

1.00000 


BIBBE 


TTT.F 


[m | EULER METHOD 

Do 10 steps. Solve the problem exactly. Compute die error. 
(Show the details.) 

1. y' = y, y( 0) = 1, h = 0.1 

2. / = y, y( 0) = 1, h = 0.01 

3. v ' = (y - a*) 2 , y( 0) = 0, h = 0.1 

4. y' = (y + a) 2 , y( 0) = 0, /? = 0.1 


5-10 


IMPROVED EULER METHOD 


Do 10 steps. Solve exactly. Compute the error. (Show the 
details.) 


5. y* = y, v(0) = 1. h = 0. 1. Compare with Prob. I 
and comment. 


6. (Logistic population) y = y - y 2 y(Q) = 0.2, h = 0. 1 


7. y' - .vy 2 = 0, y(0) = 1. h = 0.1 

8. y' + y tan x = sin 2 a% y(0) = 1, /i = 0.1 

9. Do Prob. 7 using the Euler method with h = 0. 1 and 
compare the accuracy. 

10. y' = 1 - ly 2 , y(0) = 0, /i = 0.1 


11-17 


CLASSICAL RUNGE-KUTTA METHOD 
OF FOURTH ORDER 


Do 10 steps. Compare as indicated. Comment. (Show the 
details.) 

11. y* ~ xy 2 = 0, y(0) = 1, h = 0.1. Compare with 
Prob. 7. Apply (10) to y 10 . 

12. y* = v - y 2 , y(0) = 0.2, h = 0. 1. Compare with 
Prob. 6. Apply (10) to y 10 . 
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13. / = (1 + y( 1) = e,h = 0.2 

14. v' = |(v/.v - x :/v), y(2) = 2, h = 0.2 

15. y f + y tan A' = sin 2a*, y( 0) = 1, h = 0. 1 

16. In Prob. 15 use h = 0.2 (5 steps) and compare die error. 

17. y' + 5a* 4 v 2 = 0, y( 0) = 1, h = 0.2 

18. Kutta’s third-order method is defined by 

y w+ i = v w + + 4 k 2 + k 3 *) with k x and k 2 as in 

RK (Table 21.4) and * 3 * = hf( x n+l , y n - k x + 2 k 2 ). 
Apply this method to (4) in Example 1. Choose 
/?. = 0.2 and do 5 steps. Compare with Table 21.6. 

19. CAS EXPERIMENT. Eulcr-Cauchy vs. RK. 

(a) Solve (5) in Example 2 by Euler, Improved Euler, 
and RK for 0 ^ a ^ 5 with step h = 0.2. Compare the 
errors for x = 1, 3, 5 and comment. 


(b) Graph solution curves of the ODE in (5) for 
various positive and negative initial values. 

(c) Do a similar experiment as in (a) for an initial 
value problem that has a monotone increasing or 
monotone decreasing solution. Compare the behavior 
of the error with that in (a). Comment. 

20. CAS EXPERIMENT. RKF. (a) Write a program for 
RKF dial gives A* n , y n , the estimate (10), and if the 
solution is known, the actual error e n . 

(b) Apply the program to Example 5 in the text 
(10 steps, h = 0.1). 

(c) in (b) gives a relatively good idea of the size 
of the actual error. Is this typical or accidental? Find 
out by experimentation with other problems on what 
properties of the ODE or solution this might depend. 


21 .i Multistep Methods 

In a one-step method we compute y n+1 using only a single step, namely, die previous 
value y n . One-step methods are “self-starting,” they need no help to get going because 
they obtain y x from the initial value y 0 , etc. All methods in Sec. 21.1 are one-step. 

In contrast, a multistep method uses in each step values from two or more previous 
steps. These methods are motivated by the expectation that the additional information will 
increase accuracy and stability. But to get started, one needs values, say, y 0 , y x> y 2 , y 3 in 
a 4-step method, obtained by Runge-Kutta or another accurate method. Thus, multistep 
methods are not self-starting. Such methods are obtained as follows. 

Adams-Bashforth Methods 

We consider an initial value problem 

(1) / = f(x, y), ,v(A- 0 ) = y 0 

as before, with f such that the problem has a unique solution on some open interval 
containing a* 0 . We integrate y f = /(*, y ) from x n to * n+1 = H- /?. This gives 

I y'(x) dx = .v(jc„+i) - y(x n ) = J f(x, y(x)) dx. 

x„ 

Now comes the main idea. We replace fix, v(a)) by an interpolation polynomial p(x) (see 
Sec. 19.3), so that we can later integrate. This gives approximations y n+ j of v(a„ + 1 ) and 
y n of y(x n ). 


.Vn+i = y n + 



( 2 ) 
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Different choices of p(x ) will now produce different methods. We explain the principle 
by taking a cubic polynomial, namely, the polynomial p 3 (jc) that at (equidistant) 


has the respective values 


(3) 


fn f C*n’ Jn) 

fn— 1 f(*n-V 3*«— l) 

fn— 2 fi^n— 2* .Vn— 2 ) 

fn—3 /(•* ?z— 3» An— 3 )* 


This will lead to a practically useful formula. We can obtain p 3 (x) from Newton’s backward 
difference formula (18), Sec. 19.3: 

AsW = fn + rtfn + 2>'0' + i)V 2 /n + §r(r + l)(r + 2)V 3 / n 


where 



We integrate p 3 (x) over x from x n to x n+1 = x n + h, thus over r from 0 to 1. Since 
x = Xn + hr, we have dx = h dr. 

The integral of \r(r + 1) is 5/12 and that of Jr(r + l)(r + 2) is 3/8. We tlius obtain 

r 1 ( l 5 3 \ 

(4) J As rfv = It J°p 3 dr = /i(/ n + y Vf„ + — V 2 /„ + - V 3 / n j . 

It is practical to replace these differences by their expressions in terms of /: 

V/ n = /„ " fn—1 
V 2 / n = / n - 2/ n _ x + fn—2 

V 3 fn = fn~ ^ fn—1 + 3 f n -2 ~ fn- 3- 

We substitute this into (4) and collect terms. This gives the multistep formula of the 
Adams-Bashforth method of fourth order 


(5) 


A’n+1 An ^ (55 f n 59 f n —i + 37/ n _2 


9fn-3)- 


It expresses the new value y n+1 [approximation of the solution y of (I) at x w+1 ] in terms 
of 4 values of / computed from the y - values obtained in the preceding 4 steps. The local 
truncation error is of order h 5 , as can be shown, so that the global error is of order h 4 ; 
hence (5) does defme a fourth-order method. 
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Adams-Moulton Methods 

Adams-Moulton methods are obtained if for p{ x) in (2) we choose a polynomial that 
interpolates /( a \ y(x)) at x n+1 , x n , x n _^ • • • (as opposed to x ni x n _ v • • • used before; this 
is the main point). We explain the principle for the cubic polynomial p 3 (a) that interpolates 
at Xn +lJ x n , x n - v A' n _ 2 * (Before we had x n , x n _ v x n _ 2 , a u _ 3 .) Again using (18) in 
Sec. 19.3 but now setting r = (a* — x n +i)/h , we have 

AW = fn+i + rv/ w+ 1 + |/’(r + l)V 2 /„ + i + gr(r + l)(r + 2)V 3 / Il+1 . 

We now integrate over jc from A n to A n+1 as before. This corresponds to integrating over 
r from - 1 to 0. We obtain 


/ AW ~ \ V/ n+1 - -jj V 2 /„ +1 - V 3 f n+1 J . 

Replacing the differences as before gives 

(6) 3V.+ 1 = 3'n + / AW = v« + (9/n+l + 19/n - 5/ n _! + / w _ 2 ). 

This is usually called an Adams-Moulton formula. It is an implicit formula because 
f n+ 1 = /( y n+1 ) appears on the right, so that it defines y n+1 only implicitly , in 
contrast to (5), which is an explicit formula, not involving y^+x on the right. To use (6) 
we must predict a value y* +1 , for instance, by using (5). that is, 

(7a) y*i+i = .Vn + ^ (55/ n - 59f n -i + 37/ n _ 2 - 9 / n _ 3 ). 

The corrected new value y n+1 is then obtained from (6) with f n+x replaced by 
fn+i = f(*n+ it >n+i) and the other /*s as in (6); thus, 

(7b) y n +i = y n + ^ (9 /* +1 + 19/ w - 5/ w . 1 + /„- 2 ). 

This predictor-corrector method (7a), (7b) is usually called the Adams-Moulton 
method of fourth order. It has the advantage over RK that (7) gives the error estimate 

e ?i+l 55:5 TsO'n+l ~ Jn+l)* 

as can be shown. This is the analog of (10) in Sec. 21.1. 

Sometimes the name ‘Adams-Moulton method’ is reserved for the method with several 
corrections per step by (7b) until a specific accuracy is reached. Popular codes exist for 
both versions of the method. 

Getting Started. In (5) we need / 0 , / x , / 2 , / 3 . Hence from (3) we see that we must first 
compute y ls y 2 , y 3 by some other method of comparable accuracy, for instance, by RK or 
by RKF. For other choices see Ref. [E26] listed in App. 1. 
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EXAMPLE 1 Adams-Bashforth Prediction (7a), Adams-Moulton Correction (7b) 

Solve the initial value problem 

(8) / = x + y, .v(0) = 0 

by (7a), (7b) on the interval 0 ^ x ^ 2, choosing h = 0.2. 

Solution . The problem is the same as in Examples I -3. Sec. 21.1, so that we can compare the results. We 
compute starting values y lP y 2 . y 3 by the classical Runge-Kutta method. Then in each step we predict by (7a) and 
make one correction by (7b) before we execute the next step. The results are shown and compared with the exact 
values in Table 21.10. We see that the corrections improve the accuracy considerably. This is typical. I 


Table 21.10 Adams-Moulton Method Applied to the Initial Value Problem (8); 
Predicted Values Computed by (7a) and Corrected Values by (7b) 


n 


Starting 

y» 

Predicted 

y n * 

Corrected 

.v n 

Exact 

Values 

10 6 • Error 
of3»n 

0 

0.0 

0.000 000 



0.000000 

0 

1 

0.2 

0.021 400 



0.021 403 

3 

2 

0.4 

0.091 818 



0.091 825 

7 

3 

0.6 

0.222 107 



0.222 1 19 

12 

4 

0.8 


0.425 361 

0.425 529 

0.425 541 

12 

5 

1.0 


0.718 066 

0.718 270 

0.718 282 

12 

6 

1.2 


1.119 855 

1.120 106 

1.120 117 

11 

7 

1.4 


1.654 885 

1.655 191 

1.655 200 

9 

8 

1.6 


2.352 653 

2.353 026 

2.353 032 

6 

9 

1.8 


3.249 190 

3.249 646 

3.249 647 

1 

10 

2.0 


4.388 505 

4.389 062 

4.389 056 

-6 


Comments on Comparison of Methods. An Adams-Moulton formula is generally 
much more accurate than an Adams-Bashforth formula of the same order. This justifies 
the greater complication and expense in using the former. The method (7a), (7b) is 
numerically stable , whereas the exclusive use of (7a) might cause instability. Step size 
control is relatively simple. If |Corrector — Predictor) > TOL, use interpolation to generate 
“old” results at half the current step size and then try A/2 as the new step. 

Whereas the Adams-Moulton formula (7a), (7b) needs only 2 evaluations per step, 
Runge-Kutta needs 4; however, with Runge-Kutta one may be able to take a step size 
more than twice as large, so that a comparison of this kind (widespread in the literature) 
is meaningless. 

For more details, see Refs. [E25], [E26] listed in App. 1 . 



1. Cany out and show the details of the calculations 
leading to (4)-(7) in the text. 


2-11 


ADAMS-MOULTON METHOD (7a), (7b) 


Solve the initial value problems by Adams-Moulton, 10 steps 
with 1 correction per step. Solve exactly and compute the 
error. (Use RK where no starting values are given.) 


2. / = y, y(0) = J, A = 0.1 (1.105171, 1.221403, 
1.349859) 

3. y' = — 0.2xy, y{ 0) = 1, A = 0.2 

4. y' = 2xy, y(0) = 1, A = 0.1 

5. / = I + .v 2 , y(0) = 0, A = 0.1 

6. Do Prob. 4 by RK, 5 steps, A = 0.2. Compare the errors. 
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7. Do Prob. 5 by RK, 5 steps, h = 0.2. Compare the errors. 

8. y' = xfy 9 v(l) = 3, // = 0.2 

9. y f — (a* + y — 4) 2 , y(0) = 4, h = 0.2, only 7 steps 
(why?) 

10. y' = 1 - 4y 2 , v(0) = 0, // = 0.1 

11. y' = x + y, y(0) = 0, h = 0.1 (0.00517083, 
0.0214026, 0.0498585) 

12. Show that by applying the method in the text to a 
polynomial of second degree we obtain the predictor 
and corrector formulas 

y«+i = y„ + -£■ (23/ n - + 5 /„_ 2 ), 

-)n+ 1 = JVi (5/n+i “P 8/ u “ / w _i). 

13. Use Prob. 12 to solve y' = 2a*v, y(0) = 1 (10 steps, 
h = 0.1, RK starting values). Compare with the exact 


solution and comment. 

14. How much can you reduce the error in Prob. 13 by 
halving h (20 steps, h = 0.05)? First guess, then 
compute. 

15. CAS PROJECT. Adams-Moulton. (a) Accurate 
starting is important in (7a), (7b). Illustrate this in 
Example 1 of the text by using starting values from the 
improved Euler-Cauchy method and compare the 
results with those in Table 21.9. 

(b) How much does the error in Prob. 1 1 decrease if 
you use exact starting values (instead of RK-vaiues)? 

(c) Experiment to find out for what ODEs poor 
starting is very damaging and for what ODEs it is not. 

(d) The classical RK method often gives the same 
accuracy with step 2h as Adams-Moulton with step 
/?, so that the total number of function evaluations is 
the same in both cases. Illustrate this with Prob. 8. 
(Hence corresponding comparisons in the literature in 
favor of Adams-Moulton are not valid. See also 
Probs. 6 and 7.) 


21.3 Methods for Systems 
and Higher Order ODEs 

Initial value problems for first-order systems of ODEs are of the form 

(i) y' = f(*, y). y(*o) = y<>. 


in components 


y[ = /i(*> >T* * * ‘ 3T(*o) = .Vio 
y } 2 = /2(*> .Vi* • » Vm)* T 2 U 0 ) = ) 7 20 


y’m /?«(•*» ^’l» * ’ * » J?77.)* .V??i,(^o) 3W>* 

f is assumed to be such that the problem has a unique solution y(x) on some open ^-interval 
containing a* 0 . Our discussion will be independent of Chap. 4 on systems. 

Before explaining solution methods it is important to note that (1) includes initial value 
problems for single wth-order ODEs, 


( 2 ) 




and initial conditions y(jc 0 ) = v'(a 0 ) = K 2 , • • • , / TO-1> („to) = K m as special cases. 
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EXAMPLE 1 


Indeed, the connection is achieved by setting 

(3) .vi =j. y* = y > 3’s = .V , • • • . y m = r • 

Then we obtain the system 

3’i ~ 3*2 
3’2 = 3'3 

w j 

3'm = fix, 3’x, • • • , y m ) 

and the initial conditions y x (x 0 ) = K x , = * • • , y m (x 0 ) = K vv 

Euler Method for Systems 

Methods for single first-order ODEs can be extended to systems (I) simply by writing vector 
functions y and f instead of scalar functions y and /, whereas jc remains a scalar variable. 

We begin with the Euler method. Just as for a single ODE this method will not be 
accurate enough for practical purposes, but it nicely illustrates the extension principle. 

Euler Method for a Second-Order ODE. Mass-Spring System 

Solve the initial value problem for a damped mass-spring system 

y" + 2 y + 0.75y = 0, y(0) = 3. y'(0) = -2.5 

by the Euler method for systems with step h = 0.2 for x from 0 to 1 (where .v is time). 

Solution. The Euler method (3). Sec. 21.1. generalizes to systems in the form 

($) y?i4-l ~ Yn ^ y n)» 


in components 

yi.n+1 = 3’l.« + WlfeVl.h-)'2,n) 

3’2,;i+1 = )'2,n + 3’l.n- 3’2.u) 

and similarly for systems of more than two equations. By (4) the given ODE converts to the system 

y'i = /i(*- 3T* .v 2 ) = y 2 

3 2 = / 2 (■*» .Vi* 3’2> = -2y 2 - 0.75.VX. 

Hence (5) becomes 

>1.«+i = 3 r i.n + 0.2y 2>n 

3’2,n+l ~ 3 ? 2,n + 0.2(— 2y 2tn — 0.75yx >n ). 

The initial conditions are y(0) = y x ( 0) = 3, y'(0) = y 2 (0) = -2.5. The calculations are shown in Table 21.11 
on the next page. As for single ODEs, the results would not be accurate enough for practical purposes. The 
example merely serves to illustrate the method because the problem can be readily solved exactly, 

v = Vx = 2e~ 05x + e” 1 - 5 * thus y' = y 2 = -e~ 05x - 1.5<T 1 - 5 * ■ 
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EXAMPLE 2 


Tabie 21.11 Euier Method for Systems in Example 1 (Mass-Spring System) 


n 

-v n 

3Y t « 

>'j Exact 
’ (5D) 

Error 

e l = ?1 ~ 3’l.n 

.V2.w 

v 2 Exact 
’ (5D) 

Error 

*2 = y 2 - y 2 .n 

0 

0.0 

3.00000 

3.00000 

0.00000 

-2.50000 

-2.50000 

0.00000 

1 

0.2 

2.50000 

2.55049 

0.05049 

-1.95000 

-2.01606 

-0.06606 

2 

0.4 

2.11000 

2.18627 

0.76270 

-1.54500 

-1.64195 

-0.09695 

3 

0.6 

1.80100 

1.88821 

0.08721 

- 1 .24350 

-1.35067 

-0.10717 

4 

0.8 

1.55230 

1.64183 

0.08953 

-1.01625 

-1.12211 

-0.10586 

5 

1.0 

1.34905 

1.43619 

0.08714 

-0.84260 

-0.94123 

-0.09863 


Runge-Kutta Methods for Systems 

As for Euler methods, we obtain RK methods for an initial value problem (1) simply by 
writing vector formulas for vectors with m components, which for m = 1 reduce to the 
previous scalar formulas. 

Thus for the classical RK method of fourth order in Table 21.4 we obtain 
(6a) y(A*o) = y 0 (Initial values) 


and for each step n = 0, 1, * * ■ , N — 1 we obtain the 4 auxiliary' quantities 


(6b) 


kj = h f(x n , y n ) 

k 2 = hf(x n + |/t, y„ + |k x ) 

k 3 = hf(x n + \h, y n + 2 k 2 ) 

k 4 = hf(x n + h, y„ + k 3 ) 


and the new value [approximation of the solution y(.v) at jc „. +1 = .v 0 + (n + I )h] 

(6c) y n+1 = y„ + e( k i + 2k 2 + 2k 3 + k 4 ). 


RK Method for Systems. Airy’s Equation. Airy Function Ai(x) 

Solve the initial value problem 

v" = at, y(0) = l/(3 2/3 • T(2/3)) = 0.35502 805. y'(0) = - 1/(3 1/3 * T( 1/3)) = -0.25881 940 

by the Runge-Kutta method for systems with h = 0.2; do 5 steps. This is Airy’s equation, 2 which arose in 
optics (see Ref. [A131, p. 188, listed in App. 1). T is the gamma function (see App. A3.1). The initial conditions 
are such that we obtain a standard solution, the Airy function Ai(.v), a special function that has been thoroughly 
investigated; for numeric values, see Ref. [GRi], pp. 446. 475. 


2 Named after Sir GEORGE BIDELL AIRY (1801-1892), English mathematician, who is known for his work 
in elasticity and in PDEs. 
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Solution . For y tt = ,vy, setting y 1 = y, y 2 = y[ = y' we obtain the system (4) 


y'i = y*2 

> f 2 = *Vl- 


Hence f = [fi / 2 ] T in (1) has the components f x (x. v) = y 2 , / 2 (v, y) = .ryj. We now write (6) in 
components. The initial conditions (6a) are y lf0 = 0.35502 805. y 2t0 = —0.25881 940. In (6b) we have 
fewer subscripts by simply writing k x = a, k 2 = b, k 3 = c, k 4 = d, so that a = [n x n 2 ] T . etc. Then (6b) 
takes the form 


a 


h 


.V2,n 

.¥l.n. 


(6b*) 


_ ^ r.V2.n + \ a 2 

L(*' : n 5^)0 f l,« 4* §^i)_ 

[ .V2,n + ^2 

(■ v n *** 5^)(.Vi.n 

[ .V2,n 4- C 2 

(*w + AXyi,„ ■ 




For example, the second component of b is obtained as follows. f(.v, y) has the second component / 2 (.\\ y) = .vyj. 
Now in b (= k 2 ) the first argument is 

x = x n + £//. 

The second argument in b is 

y = y n + I*. 

and the First component of this is 

.Vl = .Vl.n + i«l- 

Together, 

■v.vi = (x n + yi)(y lM + j«i). 


Similarly for the other components in (6b*). Finally, 

(6c*) y n +i = y n + |( a + 2b + 2c + d). 

Table 21.12 shows the values y(,v) = v 1 (a < ) of the Airy function Ai(.v) and of its derivative y'( x) = y 2 (.v) as well 
as of the (rather small!) error of y(.v). M 


Table 21.12 RK Method for Systems: Values y ljn (x„) of the Airy Function Ai{x) 
in Example 2 


n 

x n 

3'l,n( x n) 

yi(Xn) Exact (8D) 

10 s * Error of y 1 


0 

0.0 

0.35502 805 

0.35502 805 

0 

-0.25881 940 

1 

0.2 

0.30370 303 

0.30370 315 

12 

-0.25240464 

mm 

0.4 

0.25474 211 

0.25474235 

24 

-0.23583 073 

B 

0.6 

0.20979 973 

0.20980006 

33 

-0.21279 185 

B 

0.8 

0.16984 596 

0.16984 632 

36 

-0.18641 171 

B 

1.0 

0.13529 207 

0.13529 242 

35 

-0.15914687 
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EXAMPLE 3 


Runge-Kutta-Nystrom Methods (RKN Methods) 

RKN methods are direct extensions of RK methods (Runge-Kutta methods) to second- 
order ODEs y " = y, y') n as given by the Finnish madiematician E. J. Nystrom {Acta 
Soc. Sci. fenn., 1925, L, No. 13]. The best known of these uses the following formulas, 
where n = 0, 1, • ■ • , N — 1 (N the number of steps): 

= \hf(x n , y n , y' n ) 

k 2 = \hf{x n 4- \h, y n + K, y' n + k x ) where K = §A(v» + k x ) 

^ ^ k 3 = 2 h f( x n + 2 h - Vn + Vn + h) 

k 4 = \hf(x n + h, y n + L, y' n + 2 k 3 ) where L = h (y„ + k 3 ). 

From this we compute the approximation y n±1 of y(x w+1 ) at x n+1 = x 0 + in -f 1)/?, 

(7b) V«+i = y n + h{y' n + \(k x + k 2 + k 3 )), 

and the approximation y 7 ' l+1 of the derivative y'(x n+1 ) needed in the next step, 

(7c) Vn+i = Vn + 3 (k x + 2k 2 + 2k 3 + k 4 ). 

RKN for ODEs y" = f{x,y) Not Containing y'. Then k 2 = k 3 in (7), which makes 
the method particularly advantageous and reduces (7) to 


k i = 2 hf(xn, y n ) 

k 2 = \hf(x n + |/j, Vn + | Kyh + 2*l)) = k 3 


(7*) k 4 = 2 hf(x n + h, v„ + hfy'n + k 2 )) 

)’«+l = Vn + MVn + §(*1 + 2* 2 )) 

Vn+l = .Vn + 3(^1 + 4^2 + k 4 ). 

RKN Method. Airy’s Equation. Airy Function Ai(x) 

For the problem in Example 2 and h = 0.2 as before we obtain from (7*) simply = 0.lA* tt y n and 
*2 ~ *3 - 0. 1 (.v n + 0.1)(y n + 0.1 v^j + 0.05A' X ), /r 4 = 0. 1 (x n + 0.2)(y n + 0.2y„ + 0.2k 2 ). 

Table 21.13 shows the results. The accuracy is the same as in Example 2, but the work was much less. I 


Table 21.13 Runge-Kutta-Nystrom Method Applied to Airy’s Equation, Computation of 
the Airy Function y = Ai(x) 



y» 

t 

yn 

y(x) Exact (8D) 

10 8 • Error 
OfVn 

0.0 

0.355 028 05 

-0.258 81940 

0.355 028 05 

0 

0.2 

0.303 703 04 

-0.252 404 64 

0.303 703 15 

11 

0.4 

0.254 742 1 1 

-0.235 830 70 

0.254742 35 

24 

0.6 

0.209 799 74 

-0.212 79172 

0.209 800 06 

32 

0.8 

0.169 845 99 

-0.186411 34 

0.169 846 32 

33 

1.0 

0.135 292 18 

-0.159 146 09 

0.135 292 42 

24 
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EXAMPLE 4 


Our work in Examples 2 and 3 also illustrates that usefulness of methods for ODEs in the 
computation of values of “higher transcendental functions*” 

Backward Euler Method for Systems. Stiff Systems 

The backward Euler formula (16) in Sec. 21.1 generalizes to systems in the form 

(®) Yn+1 Yn h 1* Yrn-l) 0* ” U * ’)• 

This is again an implicit method, giving y n+1 implicitly for given y n . Hence (8) must be 
solved for y n+1 . For a linear system this is shown in the next example. This example also 
illustrates that, similar to the ease of a single ODE in Sec. 21.1, the method is very useful 
for stiff systems. These are systems of ODEs whose matrix has eigenvalues A of very 
different magnitudes, having the effect that, just as in Sec. 2 1 . 1 , the step in direct methods, 
RK for example, cannot be increased beyond a certain threshold without losing stability. 
(A = — 1 and -10 in Example 4, but larger differences do occur in applications.) 

Backward Euler Method for Systems of ODEs. Stiff Systems 

Compare ihe backward Euler method (8) with the Euler and the RK methods for numerically solving the initial 
value problem 

_y" 4- 11/ 4- lOy = IO.v + 1 1, y(0) = 2, v'(0) = -10 

converted to a system of first-order ODEs. 

Solution . The given problem can easily be solved, obtaining 

so that we can compute errors. Conversion to a system by setting y = y v y' = y 2 [see (4)] gives 

A = V2 Vi(0) = 2 

.v 2 = ~\0y x - 1 Iy 2 + IO.v 4- 1 1 y 2 (0) = -10. 

The coefficient matrix 


' o r 


-A 1 

A = 

has the characteristic determinant 


.-10 -11. 


-10 -A - II 


whose value is A 2 4- 1 1 A + 10 = (A + 1)(A 4- 10). Hence the eigenvalues, are - 1 and - 10 as claimed above. 
The backward Euler formula is 


y i.«+i _ >’2.»+i 

,- v 2.n+lJ L V 2,trJ L”^* V l,n+l ” 1 h'2,n+l + l0.V n+1 4 * 1 1_ 
Reordering terms gives the linear system in the unknowns and y 2 . n » 1 

JTn+l ” hy2,n+l ~ 3 r l,« 

10A >T t n+I + U + 1 l%2,w+ 1 ” V2,n + 10* C% + h) + 1 1 it. 

The coefficient determinant is D = 1 4- 11 h 4* 10/t 2 , and Cramers rule (in Sec. 7.6) gives the solution 

(I + I l% ltW + hy 2m n + 10/f 2 V n 4- ll// 2 4- 10/z 3 ’ 

- 10/zv lr „ 4 - y 2 n 4 I0/u*„ 4 - J \h 4- m Z 


)Wl “ D 


y ? l-rl = 
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Table 21.14 Backward Euler Method (BEM) for Example 4. Comparison with Euler and RK 


-V 

BEM 
A = 0.2 

BEM 
A = 0.4 

Euler 
// = 0.1 

Euler 
A = 0.2 

RK 

A = 0.2 

RK 

h = 0.3 

Exact 

0.0 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

0.2 

1.36667 


1.01000 

0.00000 

1.35207 


1.15407 

0.4 

1.20556 

1.31429 

1.56100 

2.04000 

1.18144 


1.08864 

0.6 

1.21574 


1.13144 

0.1 1200 

1.18585 

3.03947 

1.15129 

0.8 

1.29460 

1.35020 

1.23047 

2.20960 

1.26168 


1.24966 

1.0 

1.40599 


1.34868 

0.32768 

1.37200 


1.36792 

1.2 

1.53627 

1.57243 

1.48243 

2.46214 

1.50257 

5.07569 

1.50120 

1.4 

1.67954 


1.62877 

0,60972 

1.64706 


1.64660 

1.6 

1.83272 

1.86191 

1.78530 

2.76777 

1.80205 


1.80190 

1.8 

1.99386 


1.95009 

0.93422 

1.96535 

8.72329 

1.96530 

2.0 

2.16152 

2.18625 

2.12158 

3.10737 

2.13536 


2.13534 


Table 21.14 shows the following. 

Stability of the backward Euler method for h = 0.2 and 0.4 (and in fact for any h; try h = 5.0) with decreasing 
accuracy for increasing h. 

Stability of the Euler method for It = 0.1 but instability for h = 0.2, 

Stability of RK for h = 0.2 but instability for h = 0.3. 

Figure 45 1 shows the Euler method for h = 0. 1 8, an interesting case with initial jumping (for about x < 3) but 
later monotone following the solution curve of y = y\. See also CAS Experiment 21. ■ 



Fig. 451. Euler method with h = 0.18 in Example 4 



1. Verify the calculations in Example 1 . 


2r- 1 


EULER FOR SYSTEMS 
AND SECOND-ORDER ODES 


Solve by the Euler method: 

2 - yi - ~3.Vi + .v 2 » yL = )’i “ 3y 2 , >i( 0) = 2, y 2 (0) = 0, 
h — 0.1 t 5 steps 

3 - .Vi = Vi. y 2 = 3’2> .Vi(0) = 1. ,v 2 (0) = -!,/) = 0.2, 


5 steps 

4. y\ = 3’i> y'i = “3’2. 3'iCO) = 2, y 2 (0) = 2, A = 0.1, 
10 steps 

5. y" + 4y = 0, y(0) = 1. _y'(0) = 0, A = 0.2, 
5 steps 

6. y" - y = x, 3’(0) = I, v'(0) = -2. A = 0.1, 
5 steps 
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7. y[ = -y x + y 2 > y 2 = -vi - v 2 , yi(0) = 0, 
y 2 ( 0) = 4, h = 0.1, 10 steps 


8. Verify the formulas and calculations for the Airy 
equation in Example 2. 

1 9-14 1 RK FOR SYSTEMS 

Solve by the classical RK: 

9. The system in Prob. 7. How much smaller is the error? 

10. The ODE in Prob. 6. By what factor did the error 
decrease? 

11. Undamped Pendulum, y" 4- siny = 0, y(7f) = 0, 
y'('jf) = 1, h = 0.2, 5 steps. How does your result fit 
into Fig. 92 in Sec. 4.5? 

12. Bessel Function J 0 . xy” 4* y* 4- xy = 0, 

y(l) =0.765 198, y'(l) = -0.440051,/? = 0.5, 5 steps. 
(This gives the standard solution y 0 (jc) in Fig. 107 in 
Sec. 5.5.) 

13. y[ = “4yi + y 2 , y 2 = y x ~ 4y 2 , >‘i(0) — 0, 
y 2 (0) = 2, /? = 0.1, 5 steps 

14. The system in Prob. 2. How much smaller is the error? 


15. Verify the calculations for the Airy equation in 
Example 3. 


16-19 


RUNGE-KUTTA-NYSTROM METHOD 


Do by RKN: 


16. Prob. 12 (Bessel function / 0 ). Compare the results. 


17. y" - xy’ + Ay = 0, y( 0) = 3, /(0) = 0, 
h = 0.2, 5 steps (Exact: y = .v 4 — 6,v 2 + 3.) 

18. (.v 2 - a')v" - xy' + y = 0, y(|) = 1 - | In 2, 
,y'(|) = 1 - In 2, h = 0.1, 4 steps 

19. Prob. 11. Compare the results. 

20. CAS EXPERIMENT. Comparison of Methods, (a) 
Write programs for RKN and RK for systems. 

(b) Try them out for second-order ODEs of your 
choice to find out empirically which is better in specific 
cases. 

(c) In using RKN, would it pay to first eliminate y' 
(see Prob. 29 in Problem Set 5.5)? Find out 
experimentally. 

21. CAS EXPERIMENT. Backward Euler and 
Stiffness. Extend Example 4 as follows. 

(a) Verify the values in Table 21.14 and show them 
graphically as in Fig. 451. 

(b) Compute and graph Euler values for h near the 
“critical” h = 0.18 to determine more exactly when 
instability starts. 

(c) Compute and graph RK values for values of /? 
between 0.2 and 0.3 to find h for which the RK 
approximation begins to increase away from the exact 
solution. 

(d) Compute and graph backward Euler values for 
large h; confirm stability and investigate the error 
increase for growing h. 


21.4 Methods for Elliptic PDEs 

The remaining sections of this chapter are devoted to numerics for PDEs (partial 
differential equations), particularly for the Laplace, Poisson, heat, and wave equations. 
These PDEs are basic in applications and, at the same time, are model cases of elliptic, 
parabolic, and hyperbolic PDEs, respectively. The definitions are as follows, (recall also 
Sec. 12.4). 

A PDE is called quasilinear if it is linear in the highest derivatives. Hence a second- 
order quasilinear PDE in two independent variables x , v is of the form 

(1) 4* / 2.bll X y 4* Cltyy _V, Hj Uy) . 

u is an unknown function of x and y (a solution sought). F is a given function of the 
indicated variables. 

Depending on the discriminant ac - b 2 , the PDE (l) is said to be of 

elliptic type if ac — b 2 > 0 (example: Laplace equation) 

parabolic type if ac - b 2 = 0 (example: heat equation) 

hyperbolic type if ac — b 2 < 0 (example: wave equation ). 
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Here, in the heat and wave equations, v is time t. The coefficients a y b, c may be functions 
of a, y, so that the type of (1) may be different in different regions of the xv-plane. This 
classification is not merely a formal matter but is of great practical importance because 
the general behavior of solutions differs from type to type and so do the additional 
conditions (boundary and initial conditions) that must be taken into account. 

Applications involving elliptic equations usually lead to boundary value problems in a 
region R , called a first boundary value problem or Dirichlet problem if u is prescribed 
on the boundary curve C of /?, a second boundary value problem or Neumann problem 
if u n = du/dn (normal derivative of u) is prescribed on C, and a third or mixed problem 
if u is prescribed on a part of C and u n on the remaining part. C usually is a closed curve 
(or sometimes consists of two or more such curves). 


Difference Equations for the Laplace and 
Poisson Equations 

In this section we consider the Laplace equation 

(2) V 2 W = ll XX + Uyy = 0 

and the Poisson equation 

(3) V 2 m = u xx + Uyy = f(x, y). 

These are the most important elliptic PDEs in applications. To obtain methods of numeric 
solution, we replace the partial derivatives by corresponding difference quotients, as 
follows. By the Taylor formula, 

(a) u(x + A, y) = u( a, y) + hu x {x, y) + %h 2 u xx (x s v) + %h 3 u xxx (x, y) + • • • 

(4) 

(b) u{x - lu y) = u(x. v) - hujx. v) + £A 2 w**(jc, .v) - $h z u xa J L x. y) + • • • 

We subtract (4b) from (4a), neglect terms in /i 3 , /? 4 , • • • , and solve for u x . Then 

(5a) u x (x, y) = [m(a + h, v) - u(x - h, v)]. 

zn 

Similarly, 

u(x, y + k) = u(x, y) + ku y (x, y) + %k 2 ityy(x, y) + • • • 
and 

u(x, y - k) = u(x, y) - ka y (x, y) + \k 2 u yy (x, y) + • • ■ . 


By subtracting, neglecting terms in k 3 , k 4 , • • • , and solving for u y we obtain 


u v( x > >’) * [w(-v, y + k) — u(x, y - *)]. 


(5b) 
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We now turn to second derivatives. Adding (4a) and (4b) and neglecting terms in 
/z 4 , /z 5 , • • • , we obtain u(x -f h, y) -f u(x — h> .v) ~ 2w(x, y) -f hPuxJx, y). Solving for 
u^, we have 

(6a) u xx (x, y) = L [«(* + /?, y) - 2 u(x, y) + u(x - h, y)]. 

Similarly, 

(6b) Uyy (x, y) « -p- [u(x, y + k) - 2 u(x, y) + u(x, y - *)]. 

We shaU not need (see Prob. 1) 

.. . u x y(x, .v) ~ - 7 - [u(x + h, y + k) - u(x - h, y + k) 

(6c) 4 hk 

— u(x + K y — k) 4- u(x — h, y — k)]. 

Figure 452a shows the points (jc + /?, _y), (x — h, y), • • • in (5) and ( 6 ). 

We now substitute ( 6 a) and ( 6 b) into the Poisson equation (3), choosing k = h to obtain 
a simple formula: 

(7) u(x + A, y) + m(a\ y + h) 4- u(x - /?., y) 4- u(x y y - h) - 4 u(x, y) = h 2 f(x> y). 

This is a difference equation corresponding to (3). Hence for the Laplace equation (2) 
the corresponding difference equation is 

(8) u(x + h, y) + u{x ; y 4- h) + u(x — h y y) 4- u(x, y — h) — 4 w(x, y) = 0. 

h is called the mesh size. Equation ( 8 ) relates it at (x, y) to u at the four neighboring points 
shown in Fig. 452b. It has a remarkable interpretation: u at (x, y) equals the mean of the 
values of u at the four neighboring points. This is an analog of the mean value property 
of harmonic functions (Sec. 18.6). 

Those neighbors are often called E (East), N (North), W (West), S (South). Then 
Fig. 452b becomes Fig. 452c and (7) is 

(7*) u(E) 4 it(N) 4 u(W) 4 u(S) - 4 u(x, y) = h 2 f(x , y). 


(x, y + k) 
X 


(x, y + h ) 
X 


N 

X 


(x-h,y) X- 


<> 


( x,y ) 


X 

(x,y-k) 


X (x + k, y) (x-h,y) X Q 


(*i y) 


X 

(x,y-h) 



X E 


(a) Points in (5) and (6) (b) Points in (7) and (8) 

Fig. 452. Points and notation in (5)-(8) and (7*) 


(c) Notation in (7*) 
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Our approximation of h 2 V 2 u in (7) and (8) is a 5-point approximation with the 
coefficient scheme or stencil (also called pattern, molecule , or star) 



1 

> 


i 

(9) - 

1 -4 

1 

> . We may now write (7) as < 

1-4 1 > 


1 

> 


1 


u = h 2 f( a\ y). 


Dirichlet Problem 

In numerics for the Dirichlet problem in a region R we choose an /? and introduce a square 
grid of horizontal and vertical straight lines of distance h. Their intersections are called 
mesh points (or lattice points or nodes). See Fig. 453. 

Then we approximate the given PDE by a difference equation [(8) for the Laplace equation], 
which relates the unknown values of u at the mesh points in R to each other and to the given 
boundary values (details on p. 913). This gives a linear system of algebraic equations. By 
solving it we get approximations of the unknown values of u at the mesh points in R. 

We shall see that the number of equations equals the number of unknowns. Now comes 
an important point. If the number of internal mesh points, call it p, is small, say, p < 100, 
then a direct solution method may be applied to that linear system of/7 < 100 equations 
in p unknowns. However, if p is large, a storage problem will arise. Now since each 
unknown it is related to only 4 of its neighbors, the coefficient matrix of the system is a 
sparse matrix, that is, a matrix with relatively few nonzero entries (for instance, 500 of 
10000 when p = 100). Hence for large p we may avoid storage difficulties by using an 
iteration method, notably the Gauss-Seidel method (Sec. 20.3), which in PDEs is also 
called Liebmann’s method. Remember that in this method we have the storage 
convenience that we can overwrite any solution component (value of u) as soon as a “new” 
value is available. 

Both cases, large p and small /;, are of interest to the engineer, large p if a fine grid is 
used to achieve high accuracy, and small p if the boundary values are known only rather 
inaccurately, so that a coarse grid will do it because in this case it would be meaningless 
to try for great accuracy in the interior of the region R. 

We illustrate this approach with an example, keeping the number of equations small, 
for simplicity. As convenient notations for mesh points and corresponding values of the 
solution (and of approximate solutions) we use (see also Fig. 453) 

( 10 ) P {j = (ih, jit), itjj = u(ihy jli). 



Fig. 453. Region in the xy-plane covered by a grid of mesh h , 
also showing mesh points = (h, /?),•••, P V} = (ih r jh), • • • 
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EXAMPLE 1 


With this notation we can write (8) for any mesh point P % in the form 

( 11 ) t'k+lj ^ iyj+l 1 0 . 


Laplace Equation. Liebmann’s Method 

The four sides of a square plate of side 12 cm made of homogeneous material are kept at constant temperature 
0°C and I00°C as shown in Fig. 454a. Using a (very wide) grid of mesh 4 cm and applying Liebmann’s method 
(that is. Gauss-Seidel iteration), find the (steady-state) temperature at the mesh points. 

Solution . In the case of independence of time, the heat equation (see Sec. 10.8) 

lt i = c ( tl xx u yy) 


reduces to the Laplace equation. Hence our problem is a Dirichlet problem for the latter. We choose the grid 
shown in Fig. 454b and consider the mesh points in the order P 12 , P% 2 ' We use (1 1) and, in each 

equation, take to the right all the terms resulting from the given boundary values. Then we obtain the system 


(12) 


-41/n 4- «2X + M| 2 = “200 


Mjj 4«21 M 22 — — 200 

Nil “ 4«12 + N 2 2 = —100 


"21 + «12 ” 4«22 ” ”100. 


In practice, one would solve such a small system by the Gauss elimination, finding « u = u 2 \ = 87.5, 
a 12 = ll 22 = 62.5. 

More exact values (exact to 3S) of the solution of the actual problem |as opposed to its model (12)1 are 88. 1 
and 61.9, respectively. (These were obtained by using Fourier series.) Hence the error is about 1%. which is 
surprisingly accurate for a grid of such a large mesh size h . If the system of equations were large, one would 
solve it by an indirect method, such as Liebmann’s method. For (12) this is as follows. We write (12) in the 
form (divide by -4 and take terms to the right) 

Mu = 0.25«2i 0.25// 12 50 

m 2 i = 0.25 mh 4- 0.25h 2 2 “1“ 50 

Hj 2 — 0.25//j| 4* 0.25«22 4* 25 

m 22 = 0.25«2i + 0.25«i2 4- 25. 

These equations are now used for the Gauss-Seidel iteration. They arc identical with (2) in Sec. 20.3, where 
H n = x i> w 2i = x 2> M i 2 = u 22 = * 4 > ancI iteration is explained there, with 100. 100, 100, 100 chosen 

as starting values. Some work can be saved by better starting values, usually by taking the average of the 
boundary values that enter into the linear system. The exact solution of the system is n n = w 21 = 87.5, 
Mi 2 = m 22 = 62.5, as you may verify. 




100 


(a) Given problem 

Fig. 454. 


(b) Grid and mesh points 
Example 1 
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Remark, It is interesting to note dial if we choose mesh // = Lin ( L = side of R ) and consider the (/i — l) 2 
internal mesh points (i.e., mesh points not on the boundary) row by row in the order 

^ 11 - ^ 21 * ' * • P 12' P 22' * * * • P n— 1 . 2 * 

then the system of equations has the (n - l ) 2 X (/j - l ) 2 coefficient matrix 


"B I 


"-4 1 

1 B I 


l -4 1 


Here B = 


1 B I 


1 -4 1 

i* I B_ 


1 -4. 


is an (// - 1) X (n — 1) matrix. (In (12) we have n = 3. (h - l) 2 = 4 internal mesh points, two submatrices 
B, and two submatrices I.) The matrix A is nonsingular. This follows by noting that the off-diagonal entries in 
each row of A have the sum 3 (or 2), whereas each diagonal entry of A equals —4, so that nonsingularity is 
implied by Gerschgorin's theorem in Sec. 20.7 because no Gerschgorin disk can include 0. U 


A matrix is called a band matrix if it has all its nonzero entries on the main diagonal 
and on sloping lines parallel to it (separated by sloping lines of zeros or not). For example, 
A in ( 13) is a band matrix. Although the Gauss elimination does not preserve zeros between 
bands, it does not introduce nonzero entries outside the limits defined by the original 
bands. Hence a band structure is advantageous. In (13) it has been achieved by carefully 
ordering the mesh points. 


ADI Method 

A matrix is called a tridiagonal matrix if it has all its nonzero entries on the main diagonal 
and on the two sloping parallels immediately above or below the diagonal. (See also 
Sec. 20.9.) In this case the Gauss elimination is particularly simple. 

This raises the question of whether in the solution of the Dirichlet problem for the 
Laplace or Poisson equations one could obtain a system of equations whose coefficient 
matrix is tridiagonal. The answer is yes, and a popular method of that kind, called the 
ADI method (alternating direction implicit method) was developed by Peaceman and 
Rachford. The idea is as follows. The stencil in (9) shows that we could obtain a tridiagonal 
matrix if there were only the three points in a row (or only the three points in a column). 
This suggests that we write ( 1 1 ) in the form 

(14a) - 4 Uy + Hi* u = -Ui.j-1 - 

so that the left side belongs to 3 ’-Row j only and the right side to A-Column i. Of course, 
we can also write (11) in the form 

(14b) 4w^- 4* Uj t j + 1 — iti—ij 

so that the left side belongs to Column / and the right side to Row j. In the ADI method 
we proceed by iteration. At every mesh point we choose an arbitrary starting value uffl. 
In each step we compute new values at all mesh points. In one step we use an iteration 
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EXAMPLE 


formula resulting from (14a) and in the next step an iteration formula resulting from (14b), 
and so on in alternating order. 

In detail: suppose approximations u*^ l) have been computed. Then, to obtain the next 
approximations My l+1 \ we substitute the it# 1 * on the right side of (14a) and solve for the 
u ij l+l} on teft side; that is, we use 

(15a) - 4i4r w + = -“ij-i ~ “tU 


We use (15a) for a fixed f that is, for a fixed row 7 , and for all internal mesh points in 
this row. This gives a linear system of N algebraic equations (N = number of internal 
mesh points per row) in N unknowns, the new approximations of u at these mesh points. 
Note that (15a) involves not only approximations computed in the previous step but also 
given boundary values. We solve the system (15a) (7 fixed!) by Gauss elimination. Then 
we go to the next row, obtain another system of N equations and solve it by Gauss, and 
so on, until all rows are done. In the next step we alternate direction , that is, we compute 
the next approximations m|T* 2> column by column from the / 4 } H * Hl) and the given boundary 
values, using a formula obtained from (14b) by substituting the u^ l+1) on the right : 


1 


For each fixed /, that is, for each column , this is a system of M equations ( M — number 
of internal mesh points per column) in M unknowns, which we solve by Gauss elimination. 
Then we go to the next column, and so on, until all columns are done. 

Let us consider an example that merely serves to explain the entire method. 

Dirichlet Problem. ADI Method 

Explain the procedure and formulas of ihe ADI method in terms of the problem in Example l. using the same 
grid and starting values 100. 100. 100. 100. 

Solution . While working, we keep an eye on Fig. 454b on p. 913 and the given boundary values. We obtain 
first approximations u*if Mgi «i2» u 22 f rom < 15a) with m = 0. Wc write boundary values contained in (15a) 
without an upper index, for better identification and 10 indicate that these given values remain the same during 
the iteration. From (15a) with m = 0 we have for j = 1 (first row) the system 

(/ = 1) Hqi - 4wiY + m 21 = “ w io “ «12 

(/ = 2) MiV “ 4// 2 V + »31 = “1*20 ~ M 22* 

The solution is u™ = 11*21 = 100* For / - 2 (second row) we obtain from (15a) the system 

0 — I) ttQ2 ~ 4x/i2 + t( 22 = “Ml? ~ ^13 

O' = 2) XX 12 - 44 V + "32 = -«21 “ X/23- 

The solution is 4V = x / 22 — 66.667. 

Second approximations 11 * 11 . **21 ■ lf \% ll< 22 are now obtained from (J5b) with m = 1 by using the first 
approximations just computed and the boundary values. For i = I (first column) we obtain from (15b) the system 

O = 1 ) x / 10 — 4 t 1*11 + M12 = “//qi ~ u< 2 \ 

(j = 2) ifff - 4//J2 + "l3 = ~«02 - “22- 

The solution is 4V = 91.1 1. 11*12 = 64.44. For i — 2 (second column) we obtain from (15b) the system 

O = 1 ) x/20 — 4x# * 2 \ + lt *22 — ~xx Vi — x/31 

(j = 2) 11*21 ~ 4l#22 + ll 23 = ”«12 “ m 32- 

The solution is = 91.1 1, 11*22 = 64.44. 
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In ihis example, which merely serves to explain the practical procedure in the ADI method, the accuracy 
of the second approximations is about the same as that of two Gauss-Seidel steps in Sec. 20.3 (where 
«n ” *i, "21 = v 2» "i2 = "22 = - v 4)> as the following table shows. 


Method 

Mn 

W 2 1 

«12 

M22 

ADI, 2nd approximations 

91.11 

91.11 

64.44 

64.44 

Gauss-Seidel, 2nd approximations 

93.75 

90.62 

65.62 

64.06 

Exact solution of (12) 

87.50 

87.50 

62.50 

62.50 


Improving Convergence. Additional improvement of the convergence of the ADI 
method results from the following interesting idea. Introducing a parameter/;, we can also 
write (11) in the form 

(а) Ui_ hj - (2 + p)u ij + iii+u = -Uij_ 1 + (2 - p)Uij - a u+1 

(16) 

(б) ^i,j — X (“ P)^ij ^i,j + 1 ~^i—l ,j "1" (2 P) 

This gives the more general ADI iteration formulas 

(a) a&tf “(2 4- 4 = -iifti 1 4 (2 - p)utf> ~ u<& 1 

(17) 

(b) -(2 4 p)utf+» 4 11 %:? = 4 (2 - p)ufr +l ' “ u?:ty. 

For p = 2, this is (15). The parameter p may be used for improving convergence. Indeed, 
one can show that the ADI method converges for positive /?, and that the optimum value 
for maximum rate of convergence is 


(18) Po = 2 sin ^ 

where K is the larger of M 4 1 and N 4 1 (see above). Even better results can be achieved 
by letting p vary from step to step. More details of the ADI method and variants are 
discussed in Ref. [E25] listed in App. 1. 



1. Derive (5b), (6b), and (6c). 


2-7 


GAUSS ELIMINATION, 
GAUSS-SEIDEL ITERATION 


For the grid in Fig. 455 compute the potential at the four 
internal points by Gauss and by 5 Gauss-Seidel steps with 
starting values 100, 100, 100, 100 (showing the details of 
your work) if the boundary values on the edges are: 

2. it = 0 on the left, a* 3 on the lower edge, 27 — 9y 2 on 
the right, .r 3 — 27.v on the upper edge. 


3. m( 1 , 0) = 60, m(2, 0) = 300, 11 - 100 on the other three 
edges. 

4. u = a 4 on the lower edge, 8 1 — 54y 2 4 y 4 on the right. 
a 4 — 54a* 2 4 81 on the upper edge, y 4 on the left. 
Verify the exact solution a* 4 - 6a* 2 v 2 4 y 4 and 
determine the error. 

5. u = sin §tta' on the upper edge, 0 on the other edges. 
10 steps. 

6. u = 220 on the upper and lower edges, 1 10 on the left 
and right. 
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7. Vi, on the upper and lower edges, - V 0 on the left and 
right. Sketch the equipotential lines. 


use symmetry; take it = 0 as the boundary value at the 
two points at which the potential has a jump. 



'y 

10 

fn 


*u , 

, P 21 





A* 


Fig. 455. Problems 2-7 


8. Verify the calculations in Example 1. Find out 
experimentally how many steps are needed to obtain the 
solution of the linear system with an accuracy of 3S. 

9. (Use of symmetry) Conclude from the boundary 
values in Example I that w 2 i = «n and u 2 2 = ^ 12 - 
Show that this leads to a system of two equations and 
solve it. 

10. (3x3 grid) Solve Example 1, choosing h = 3 and 
starting values 100, 100, • • • . 

11. For the square 0 ^ x = 4, 0 ^ y ^ 4 let the boundary 
temperatures be 0°C on the horizontal and 50°C on the 
vertical edges. Find the temperatures at the interior 
points of a square grid with h - I . 

12. Using the answer to Prob. II, try to sketch some 
isotherms. 

13. Find the isotherms for the square and grid in Prob. 1 1 
if u = sin qTtx on the horizontal and —sin y on the 
vertical edges. Try to sketch some isotherms. 

14. (Influence of starting values) Do Prob. 5 by 
Gauss-Seidel, starting from 0. Compare and comment. 

15. Find the potential in Fig. 456 using (a) the coarse grid, 
(b) the fine grid, and Gauss elimination. Hint. In (b), 


u= 110 V 


u - 110 V 


tt = -I10V 





P n 




«=110V 


H=-110V 


I< = -110V 

Fig. 456. Region and grids in Problem 15 


16. (ADI) Apply the ADI method to the Dirichlet problem 
in Prob. 5, using the grid in Fig. 455, as before and 
starting values zero. 

17. What Pq in (1 8) should we choose for Prob. 16? Apply 
the ADI formulas (17) with p 0 = 1.7 to Prob. 16, 
performing 1 step. Illustrate the improved convergence 
by comparing with the corresponding values 0.077, 
0.308 after the first step in Prob. 16. (Use the starting 
values zero.) 

18. CAS PROJECT. Laplace Equation, (a) Write a 
program for Gauss-Seidel with 16 equations in 16 
unknowns, composing the matrix (13) from the 
indicated 4X4 submatrices and including a 
transformation of the vector of the boundary values into 
the vector b of Ax = b. 

(b) Apply the program to the square grid in 0 ^ ^ 5, 
0 ^ y ^ 5 with h = I and // = 220 on the upper and 
lower edges, u - 1 10 on the left edge and u = —10 
on the right edge. Solve the linear system also by Gauss 
elimination. What accuracy is reached in the 20th 
Gauss-Seidel step? 


21.5 Neumann and Mixed Problems. 

Irregular Boundary 

We continue our discussion of boundary value problems for elliptic PDEs in a region R 
in the A\y-plane. The Dirichlet problem was studied in the last section. In solving Neumann 
and mixed problems (defined in the last section) we are confronted with a new situation, 
because there are boundary points at which the (outer) normal derivative u n = du/dn of 
the solution is given, but u itself is unknown since it is not given. To handle such points 
we need a new idea. This idea is the same for Neumann and mixed problems. Hence we 
may explain it in connection with one of these two types of problems. We shall do so and 
consider a typical example as follows. 
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EXAMPLE 1 


Mixed Boundary Value Problem for a Poisson Equation 


Solve the mixed boundary value problem For the Poisson equation 

V 2 i/ = "xx + = /(- v * - v > = ,2 * v - v 

shown in Fig. 457a. 




(a) Region i? and boundary values (b) Grid ( h = 0.5) 

Fig. 457. Mixed boundary value problem in Example 1 


Solution . We use (he grid shown in Fig. 457b, where h = 0.5. We recall that (7) in Sec. 21.4 has the right 
side h 2 f(.\\ y) - 0.5 2 • I2.vv = 3.vy. From the formulas u = 3y 3 and u n = 6 .v given on the boundary we compute 
the boundary data 


(l) « 31 = 0.375. 


"32 = 3 . 


<*“12 

dn 


d"l2 

dy 


= 6 • 0.5 = 3, 


dll 22 
dn 


du 22 

dy 


6 - I 


= 6 . 


Pn and P 21 are internal mesh points and can be handled as in the last section. Indeed, from (7). Sec. 21.4, with 
h 2 = 0.25 and h 2 f(x, >') = 3.vv and From the given boundary values we obtain two equations corresponding to 
P n and P 2 \. as follows (with —0 resulting from the left boundary). 


( 2 a) 


— 4«n 4- w 2 i 4- « 12 = 12(0.5 *0.5) *5 — 0 = 0.75 

"n ~ 4 « 2 i + "22 = 12(1 • 0.5) - 0.375 = 1.125 


The only difficulty with these equations seems to be that they involve the unknown values « 12 and u 2 2 of it at 
P 12 and P 2 2 on the boundary, where the normal derivative « n = duldn - dufdy is given, instead of «; but we 
shall overcome this difficulty as follows. 

We consider P 12 and P 22 . The idea that will help us here is this. Wc imagine the region R to be extended 
above to the first row of external mesh points (corresponding to y = 1.5), and we assume that the Poisson 
equation also holds in the extended region. Then we can write down two more equations as before (Fig. 457b) 


(2b) 


«u - 4n 12 4- 1122 4- w 13 = 1.5 — 0= 1.5 

"21 *F "12 ” 4« 22 4- u 22 = 3 “ 3 = 0. 


On the right. 1.5 is 12.vy// 2 at (0.5, 1) and 3 is 12 .ry/ 1 2 at (1. 1) and 0 (at P 02 ) and 3 (at P 32 ) are given boundary 
values. We remember that we have not yet used the boundary condition on the upper part of the boundary of 
R . and we also notice that in (2b) we have introduced two more unknowns « 13 . (* 23 - But we can now use that 
condition and get rid of 3 , t / 2 3 by applying the central difference formula for dt*/dy. From (1) we then obtain 
(see Fig. 457b) 


d"l2 _ 

"13 ~ "ll 



hence 


dy ~ 

2 h 

~ "13 _ 

"ll* 

"13 = "n + 3 

^"22 

"23 — "21 



hence 


- « 

dy 

2 h 

= U 23 - 

"21 • 

ll 23 ~ W 21 + 6. 
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Substituting these results into (2b) and simplifying, we have 

2w-ji — 4i^2 ”i* ^22 = 1 *5 3 — 1.5 

2«21 + »12 - 4«22 = 3 - 3 - 6 = -6. 


Together with (2a) this yields, written in matrix form. 


(3) 


"-4 1 1 O' 


’"ll" 


"0.75 ' 


0.75 ’ 

1-401 


“21 


1.125 


1.125 

2 0-4 1 


“12 


1.5 - 3 


-1.5 

0 2 1 -4_ 


_“22_ 


0 — 6 


—6 


(The entries 2 come from « 13 and an d so do —3 and —6 on the right). The solution of (3) (obtained by 
Gauss elimination) is as follows; the exact values of the problem are given in parentheses. 

Uj 2 ~ 0.866 (exact I) u 2 2 = 1.812 (exact 2) 

m u = 0.077 (exact 0.125) i/ 2 i = 0.191 (exact 0.25). ■ 


Irregular Boundary 

We continue our discussion of boundary value problems for elliptic PDEs in a region R 
in the Jty-plane. If R has a simple geometric shape, we can usually arrange for certain 
mesh points to lie on the boundary C of R , and then we can approximate partial derivatives 
as explained in the last section. However, if C intersects the grid at points that are not 
mesh points, then at points close to the boundary we must proceed differently, as follows. 

The mesh point O in Fig. 458 is of that kind. For O and its neighbors A and P we obtain 
from Taylor’s theorem 


(4) 


bu Q 1 0 d 2 u 0 

(a) u. A = Uq + ah —— + — (ah) 2 —§■ 
ox 2 ox 


(b) Up = u 0 — h 


du. 


dx 


0 + 4 h z + 


dx“ 


We disregard the terms marked by dots and eliminate du 0 /dx. Equation (4b) times a plus 
equation (4a) gives 


u A + au P = (1 + a)u Q + — a(a + 1 )li 2 


d z u 0 

dx 2 



Fig. 458. Curved boundary C of a region R, a mesh point O near C, and neighbors A, 8, P, Q 
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EXAMPLE 2 


We solve this last equation algebraically for the derivative, obtaining 

d 2 «o 2 f 1 ,1 I 

3a - 2 “ h 2 [_ a( 1 +a) UA + l+a“ p a 

Similarly, by considering the points 0, B, and <2, 

3 2 Mq 2 r 1 1 I 

dy 2 h z |_ &( 1 + b) l ‘ B + 1 + b Uq b U ° 


By addition, 


( 5 ) 


V 2 


wo 


_ _2_ 
h 2 _ 0 


“a 




(1 + a ) 6(1 + 6 ) 


wp w Q 

1+01+6 


(0 + b)u 0 " 
06 


For example, if 0 = §, 6 = 5 , instead of the stencil (see Sec. 21.4) 


f 

1 




f * 

3 


1 

-4 

1 

► we now have 

< 

2 _4 

3 ^ 

§■ 


I 

> 



2 

l 3 



because 1/[0(1 + 0 )] = §, etc. The sum of all five terms still being zero (which is useful 
for checking). 

Using the same ideas, you may show that in the case of Fig. 459. 


y2 _ A T “a Wb up Uq _ ap + bg 1 

0 h 2 L ^(0 + p) 6(6 + 4 ) p(p + 0 ) c/(<? + 6 ) abpq °J * 

a formula that takes care of all conceivable cases. 


9 B 


bh 

ph O oh 

P O ' . 1 >■ ■« ■■ ■■ 

qh. 

)Q 


o A 


Fig. 459. Neighboring points A, B f P,Q of a 
mesh point O and notations in formula (6) 


Dirichlet Problem for the Laplace Equation. Curved Boundary 

Find the potential u in the region in Fig. 460 that has the boundary values given in that figure; here the curved 
portion of the boundary is an arc of the circle of radius 10 about (0, 0). Use the grid in the figure. 

Solution . u is a solution of the Laplace equation. From the given formulas for the boundary values u = a 3 , 
it — 512 — 24y 2 , • • • we compute the values at the points where we need them; the result is shown in the figure. 
For P n and P 12 we have the usual regular stencil, and for P 2 i and P 2 2 we use (6), obtaining 


1 


0.5 


0.9 

1 -4 1 

- * ^21 : ‘ 

0.6 -2.5 0.9 

P 22 : 

0.6 -3 0.9 

1 


0.5 


0.6 


(7) Pn*P& 
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Fig. 460. Region, boundary values of the 
potential, and grid in Example 2 


We use this and the boundary values and take the mesh points in the usual order P n , P 2 \ , P \ 2 , *22- Then we 
obtain the system 

— 4//n 4 u<2i 4- U12 =0 — 27 = —27 

0.6« n - 2.5u 21 + 0.5 u 22 = -0.9 • 296 - 0,5 • 216 = -374.4 

Kjj — 4j/j 2 + m 22 = 702 4 0 = 702 

0.6u 21 4 0.6 k 12 - 3m 22 = 0.9 • 352 4 0.9 • 936 = 1 159.2. 

Tn matrix form, 


'-4 1 1 0 “ 


"wil“ 


' -27 ~ 

0.6 -2.5 0 0.5 


“21 


-374.4 

10-41 


«12 


702 

0 0.6 0.6 -3 


_ w 22_ 


! 159.2. 


Gauss elimination yields the (rounded) values 

~ 55.6, it 2 i ~ 49.2, u -± 2 — 298.5, m 22 — 436.3. 

Clearly, from a grid with so few mesh points we cannot expect great accuracy. The exact solution of the PDE 
(not of the difference equation) having the given boundary values is u = x 3 4 5 — 1 3jry 2 and yields the values 


Mu — 54, W 21 — 54, Hi 2 — —297, m 22 — —432. 

In practice one would use a much finer grid and solve the resulting large system by an indirect 
method. ■ 


1. Verify the calculation for the Poisson equation in 
Example 1. Check the values for (3) at the end. 

2. Derive (5) in particular when a = b = 

3. Derive the general stencil formula (6) in all detail. 

4. Verify the calculation for the boundary value problem 
in Example 2. 

5. Do Example 1 in the text for V 2 u = 0 with grid and 
boundary data as before. 


MIXED BOUNDARY VALUE PROBLEMS 

6. Solve the mixed boundary value problem for the 
Laplace equation V 2 u = 0 in the rectangle in Fig. 457a 
(using the grid in Fig. 457b) and the boundary 
conditions u x = 0 on the left edge, u x = 3 on the right 
edge, 11 = x 2 on the lower edge, and u = x 2 — 1 on 
the upper edge. 

7. Solve Prob. 6 when u n = 1 on the upper edge and 
u = 1 on the other edges. 
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8. Solve the mixed boundary value problem for the 
Poisson equation V 2 u = 2(a* 2 -1- v 2 ) in the region and 
for the boundary conditions shown in Fig. 461, using 
the indicated grid. 



Fig. 461. Problems 8 and 10 

9. CAS EXPERIMENT. Mixed Problem. Do Example 
1 in the text with finer and finer grids of your choice 
and study the accuracy of the approximate values by 
comparing with the exact solution it = 2 xy 3 . Verify 
the latter. 

10. Solve V 2 m = — 77 2 v sin §77A for the grid in Fig. 461 
and u u ( 1,3) = u y ( 2, 3) = |V243, // = 0 on the other 
three sides of the square. 

IRREGULAR BOUNDARIES 

11. Solve the Laplace equation in the region and for the 
boundary values shown in Fig. 462, using the indicated 
grid. (The sloping portion of the boundary is 
y = 4.5 - a*.) 



Fig. 462. Problem 11 

12. If in Prob. 1 1 the axes are grounded ( u = 0), what 
constant potential must the other portion of the 
boundary have in order to produce 100 volts at P n ? 

13. What potential do we have in Prob. 1 1 if u = 190 volts 
on the axes and it = 0 on the other portion of the 
boundary? 

14. Solve the Poisson equation V 2 h = 2 in the region and 
for the boundary values shown in Fig. 463, using the 
grid also shown in the figure. 



Fig. 463. Problem 14 


21.6 Methods for Parabolic PDEs 

The last two sections concerned elliptic PDEs, and we now turn to parabolic PDEs. Recall 
that the definitions of elliptic, parabolic, and hyperbolic PDEs were given in Sec. 21.4. 
There it was also mentioned that the general behavior of solutions differs from type to 
type, and so do the problems of practical interest. This reflects on numerics as follows. 

For all three types, one replaces the PDE by a corresponding difference equation, but 
for parabolic and hyperbolic PDEs this does not automatically guarantee the convergence 
of the approximate solution to the exact solution as the mesh h -» 0; in fact, it does not 
even guarantee convergence at all. For these two types of PDEs one needs additional 
conditions (inequalities) to assure convergence and stability, the latter meaning that small 
perturbations in the initial data (or small errors at any time) cause only small changes at 
later times. 

In this section we explain the numeric solution of the prototype of parabolic PDEs, the 
one-dimensional heat equation 


u t 


— c^u 
L “ra; 


(c constant). 
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This PDE is usually considered for x in some fixed interval, say, 0 ^ a* ^ L, and time 
t ^ 0, and one prescribes the initial temperature w(a, 0) = /(a) (/ given) and boundary 
conditions at a = 0 and a = L for all t ^ 0, for instance w(0, f) = 0, h(L, /) = 0. We may 
assume c = 1 and L— 1; this can always be accomplished by a linear transformation of 
a and t (Prob. 1). Then the heat equation and those conditions are 


(1) 

Uxx 

o 

All 

VII 

K 

VII 

o 

(2) 

II 

O 

(Initial condition) 

(3) 

m(0, t) = m(1, t) = 0 

(Boundary conditions). 


A simple finite difference approximation of (1) is [see (6a) in Sec. 21.4; j is the number 
of the time step ] 


(4) (iiij+i Uij) H" U}_ ij). 

Figure 464 shows a corresponding grid and mesh points. The mesh size is h in the 
A-direction and k in the f-direction. Formula (4) involves the four points shown in 
Fig. 465. On the left in (4) we have used a forward difference quotient since we have no 
information for negative t at the start. From (4) we calculate u i%j+1 , which corresponds to 
time row j + 1, in terms of the three other u that correspond to time row j. Solving (4) 
for we have 

k 

(5) «ij+ 1 = (1 “ 2 r) Uij + r(u i+lij + u^ u ), r = . 



Fig. 464. Grid and mesh points corresponding to (4), (5) 


<U+ 1) 
x 

h 

<* - 1 , D X X 7 X (/ + l,j) 

h h 

( i t j ) 

Fig. 465. The four points in (4) and (5) 
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Computations by this explicit method based on (5) are simple. However, it can be 
shown that crucial to the convergence of this method is the condition 



That is, Ujj should have a positive coefficient in (5) or (for r = |) be absent from (5). Intuitively, 
( 6 ) means that we should not move too fast in the r-direction. An example is given below. 

Crank-Nicolson Method 

Condition ( 6 ) is a handicap in practice. Indeed, to attain sufficient accuracy, we have to 
choose h small, which makes k very small by ( 6 ). For example, if h = 0.1, then k ^ 0.005. 
Accordingly, we should look for a more satisfactory discretization of the heat equation. 

A method that imposes no restriction on r = klii 2 is the Crank-Nicolson method, 
which uses values of u at the six points in Fig. 466. The idea of the method is the 
replacement of the difference quotient on the right side of (4) by 5 times the sum of two 
such difference quotients at two time rows (see Fig. 466). Instead of (4) we then have 

1 

~ 2/ ? 2 W — 2 u ij + u i- l,j) 

1 

+ 2 h 2 + ~~ 

Multiplying by 2k and writing r = k/h 2 as before, we collect the terms corresponding to 
time row j + 1 on the left and the terms corresponding to time row j on the right: 

(8) (2 F 2/)Wy+j / (Wf+ij+i F “ (2 2) ) Ujj F / (Wj + l,/ “b ^i— l,j)» 

How do we use (8)? In general, the three values on the left are unknown, whereas the 
three values on the right are known. If we divide the A-interval 0 a S I in (1) into 
n equal intervals, we have n — 1 internal mesh points per time row (see Fig. 464, where 
n = 4). Then for j = 0 and i = 1, • • * , n - 1, formula (8) gives a linear system 
of n — 1 equations for the n — 1 unknown values w n , w 2 i» * * * > u n -i.i the first time 
row in terms of the initial values w 0 o? w io> * * * «. w n o an< ^ the boundary values u Ql (= 0), 
u n 1 (= 0). Similarly for j — \,j — 2, and so on; that is, for each time row we have to 
solve such a linear system of n — 1 equations resulting from (8). 

Although /■ = k/h 2 is no longer restricted, smaller r will still give better results. In 
practice, one chooses a k by which one can save a considerable amount of work, without 
making r too large. For instance, often a good choice is r — 1 (which would be impossible 
in the previous method). Then ( 8 ) becomes simply 

(9) 4w itJ - + i — — w i+i,j F j . 

Time row j +1 x X X 

h 

Time row./ X X X 

h h 

Fig. 466. The six points in the Crank- 
Nicolson formulas (7) and (8) 


k 1 - 
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EXAMPLE 1 



i = 0 i = 1 1-2 i' = 3 i = 4 i = 5 


5 

4 

3 

2 

1 

0 


Fig. 467. Grid in Example 1 


Temperature in a Metal Bar. Crank-Nicolson Method, Explicit Method 

Consider a laterally insulated metal bar of length 1 and such that c 2 = 1 in the heal equation. Suppose that the 
ends of the bar are kept at temperature u - 0°C and the temperature in the bar at some instant — call it t — 0 — 
is /(a*) = sin ?r.v. Applying the Crank-Nicolson method with h = 0.2 and r = 1, find the temperature h(a\ /) in 
the bar for 0 ^ ^ 0.2. Compare the results with the exact solution. Also apply (5) with an r satisfying (6), 

say, r = 0.25. and with values not satisfying (6), say, r = 1 and r = 2.5. 

Solution by Crank-Nicolson, Since r = I, formula (8) takes the form (9). Since It = 0.2 and 
r = klh 2 = 1 . we have k = It 2 = 0.04. Hence we have to do 5 steps. Figure 467 shows the grid. We shall need 
the initial values 


« 10 = sin 0.2 t7 = 0.587 785, k 2 o = s ' n 0-4 ?r = 0.951 057. 

Also, « 30 =• U 20 and “40 = "io* (Recall that // 10 means 11 at P l0 in Fig. 467, etc.) In each time row in 
Fig. 467 there are 4 internal mesh points. Hence in each time step we would have to solve 4 equations in 4 
unknowns. But since the initial temperature distribution is symmetric with respect to x = 0.5, and // = 0 at 
both ends for all /, we have » 31 = u 2 \, « 4 i = «u in the first time row and similarly for the other rows. This 
reduces each system to 2 equations in 2 unknowns. By (9), since n 31 = u 2 1 and u 01 = 0, for j = 0 these 
equations are 

(1 = 1) 4f#n //21 = i*oo "b 7, 20 = 0.951 057 

(i = 2) — /in + 4«2i — M 21 = M io ~b “20 ~ 1-538 842. 

The solution is » 1} = 0.399 274, n 21 = 0.646 039. Similarly, for time row j = I we have the system 

(i - 1 ) 4h 12 - «22 = w oi + “21 = 0-646 039 

(i = 2) -u 12 + 3/*22 = “11 + «2i = 1-045 313- 

The solution is m 12 = 0.271 221. « 22 = 0.438 844, and so on. This gives the temperature distribution 
(Fig. 468): 


t 

A = 0 

x = 0.2 

x = 0.4 

X = 0.6 

00 

d 

II 

X = 1 

0.00 

0 

0.588 

0.951 

0.951 

0.588 

0 

0.04 

0 

0.399 

0.646 

0.646 

0.399 

0 

0.08 

0 

0.271 

0.439 

0.439 

0.271 

0 

0.12 

0 

0.184 

0.298 

0.298 

0.184 

0 

0.16 

0 

0.125 

0.202 

0.202 

0.125 

0 

0.20 

0 

0.085 

0.138 

0.138 

0.085 

0 
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Fig. 468. Temperature distribution in the bar in Example 1 


Comparison with the exact solution . The present problem can be solved exactly by separating 
variables (Sec. 12.5); the result is 

(10) u(x, /) = sin 7rx e~ 7 ^ t . 

Solution by the explicit method (5) with r = 0,25. For It = 0.2 and r = kfh 2 = 0.25 we have 
k = rh 2 = 0.25 • 0.04 = 0.0 1 . Hence we have to perform 4 times as many steps as with the Crank-Nicolson 
method! Formula (5) with r = 0.25 is 

(11) = 0.25 (M t -_ l4 + 2uij 4- K* + i fi ). 

We can again make use of the symmetry. For j = 0 we need n 00 = 0, « ]0 = 0.587 785 (see p. 925). 

« 2 o = » 3 o = 0.95 1 057 and compute 

Hji — 0.25 (mqq t 2iti o + ^ 20 ) = 0.531 657 

«2i = 0.25 (wj_o “F - w 20 4" m 3o) = 0.25 (m^q + 3 ^ 20 ) = 0.860 239. 

Of course we can omit the boundary terms « 0 i = 0. h 02 — 0, * * • from the formulas. For J = I we 
compute 

tti 2 ~ 0.25(2i<u t u 2 1) ~ 0.480 888 
U 22 ~ 0.25 («n +■ 3 ^ 21 ) = 0.778 094 


and so on. We have to perform 20 steps instead of the 5 CN steps, but the numeric values show that the accuracy 
is only about the same as that of the Crank-Nicolson values CN. The exact 3D-values follow from (10). 


t 


d 

II 

H 



jc = 0.4 


CN 

By (11) 

Exact 

CN 

By (11) 

Exact 

0.04 

0.399 

0.393 

0.396 

0.646 

0.637 

0.641 

0.08 

0.271 

0.263 

0.267 

0.439 

0.426 

0.432 

0.12 

0.184 

0.176 

0.180 

0.298 

0.285 

0.291 

0.16 

0.125 

0.118 

0.121 

0.202 

0.191 

0.196 

0.20 

0.085 

0.079 

0.082 

0.138 

0.128 

0.132 


Failure of (5) with r violating (6). Formula (5) with h = 0.2 and r = I— which violates (6)— is 

U iJ+l = "i-l.j “ »ij + t*i+ l,j 
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and gives very poor values; some of these are 


t 

x = 0.2 

Exact 

.v = 0.4 

Exact 

0.04 

0.363 

0.396 

0.588 

0.641 

0.12 

0.139 

0.180 

0.225 

0.291 

0.20 

0.053 

0.082 

0.086 

0.132 


Formula (5) with an even larger r = 2.5 (and h = 0.2 as before) gives completely nonsensical results; some of 
these are 


/ 

x = 0.2 

Exact 

a = 0.4 

Exact 

0.1 

0.0265 

0.2191 

0.0429 

0.3545 

0.3 

0.0001 

0.0304 

0.0001 

0.0492. 


zERIgfrL-E 


1. (Nondimensional form) Show that the heat equation 
itf = c 2 ti : 0 ^ x ^ L, can be transformed to the 
“nondimensional” standard form u t = u xx , 0 ^ x ^ 1 , 
by setting x = x/L. / = c 2 7/L 2 , u = il/u 0j where u 0 is 
any constant temperature. 

2. Derive the difference approximation (4) of the heat 
equation. 

3. Derive (5) from (4). 

4. Using the explicit method [(5) with h = 1 and k = 0.5], 
find the temperature at / = 2 in a laterally insulated 
bar of length 10 with ends kept at temperature 0 and 
initial temperature f(x) = x — 0 . 1 . v 2 . 

5. Solve the heat problem (l)-{3) by Crank-Nicolson 

for 0 ^ ^ 0.20 with h = 0.2 and k = 0.04 when 

fix) = .v if 0 ^ x < fix) = I — .v if 5 ^ a ^ 1 . 
Compare with the exact values for t = 0.20 obtained 
from the series (2 terms) in Sec. 12.5. 

6 . Solve Prob. 5 by the explicit method with h = 0.2 and 
k = 0.01. Do 8 steps. Compare the last values with the 
Crank-Nicolson 3S-values 0.107, 0.175 and the exact 
3S-values 0.108, 0.175. 

7. The accuracy of the explicit method depends on 

/* ^). Illustrate this for Prob. 6 , choosing r = \ (and 

h — 0.2 as before). Do 4 steps. Compare the values for 
t = 0.04 and 0.08 with the 3S-values in Prob. 6, which 
are 0.156, 0.254 (/ = 0.04), 0.105, 0.170 (/ = 0.08). 

8 . If the left end of a laterally insulated bar extending 
from x = 0 lo.v = 1 is insulated, the boundary condition 
at x = 0 is w n (0, t) — ujf), t) = 0. Show that in the 
application of the explicit method given by (5), we can 
compute u 0J + x by the formula 

"oj+i = (I - 2r)« 0 j + 2 ru^. 

Apply this with h = 0.2 and r = 0.25 to determine the 
temperature w(.v, /) in a laterally insulated bar extending 
from ,v = 0 to 1 if «(.v, 0 ) = 0 , the left end is insulated 


and the right end is kept at temperature g(t) — sin 
Hint . Use 0 = du 0 j/dx = (u X j - «- ltJ *)/2A. 

9. In a laterally insulated bar of length 1 let the initial 
temperature be fix) = x if 0 ^ x ^ 0 . 2 , 
fix) = 0.25(1 - x) if 0.2 ^ .v ^ 1. Let n(0, /) = 0, 
w(l, /) = 0 for all t. Apply the explicit method with 
h = 0.2, k = 0.01. Do 5 steps. 

10. Solve Prob. 9 for fix) = x if 0 ^ .v ^ 0.5, 

fix) = 1 — .V if 0.5 = ,v = 1 , all the other data being 
as before. Can you expect the solution to satisfy 
nix. r) = m( 1 — .v. t) for all /? 

11. Solve Prob. 9 by (9) with h = 0.2, 2 steps. Compare 
with exact values obtained from the series in Sec. 12.5 
(2 terms) with suitable coefficients. 

12. CAS EXPERIMENT. Comparison of Methods, 

(a) Write programs for the explicit and the 
Crank-Nicolson methods. 


(b) Apply the programs to the heat problem of a 
laterally insulated bar of length 1 with i/( jc, 0) = sin 7 r.v 
and h(0, /) = «( 1 , /) = 0 for all f, using h = 0 . 2 , 
k = 0.0 1 for the explicit method (20 steps), h = 0.2 
and (9) for the Crank-Nicolson method (5 steps). Obtain 
exact 6 D-values from a suitable series and compare. 

(c) Graph temperature curves in (b) in two figures 
similar to Fig. 296 in Sec. 12.6. 

(d) Experiment with smaller h (0. 1 , 0.05, etc.) for both 
methods to find out to what extent accuracy increases 
under systematic changes of h and k. 


13-15 


CRANK-NICOLSON 


Solve (l)-(3) by Crank-Nicolson with r = 1 (5 steps), 
where: 


13. fix) = .v(I - .v), h = 0.2 

14. fix) - .v(l — a), h = 0.1 (Compare with Prob. 13.) 

15. fix) = 5a* if 0 ^ a* < 0.2, fix) = 1.25(1 - x) if 
0.2 ^ x S 1 , /? = 0.2 
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21 .4 Method for Hyperbolic PDEs 

In this section we consider the numeric solution of problems involving hyperbolic PDEs. 
We explain a standard method in terms of a typical setting for the prototype of a hyperbolic 
PDE, the wave equation: 


(1) 

U tt 1{ XX 

0 ^ jc ^ 1, r ^ 0 

(2) 

u(x, 0) = f(x) 

(Given initial displacement) 

(3) 

0) (-^0 

(Given initial velocity) 

(4) 

«(o, o « «(i, o = o 

(Boundary conditions). 


Note that an equation u tt = and another ^-interval can be reduced to the form (1) 
by a linear transformation of x and /. This is similar to Sec. 21.6, Prob. 1. 

For instance, (1)— (4) is the model of a vibrating elastic string with fixed ends at 
x = 0 and x = l (see Sec. 12.2). Although an analytic solution of the problem is given 
in (13), Sec. 12.4, we use the problem for explaining basic ideas of the numeric approach 
that are also relevant for more complicated hyperbolic PDEs. 

Replacing the derivatives by difference quotients as before, we obtain from (1) [see (6) 
in Sec. 21.4 with y = /] 

( 5 ) ~~j ~2 1 + Wfj—x) ^2 ~ 2 ltij + W*— 1,7) 

where h is the mesh size in x, and k is the mesh size in t This difference equation relates 
5 points as shown in Fig. 469a. It suggests a rectangular grid similar to the grids for 
parabolic equations in the preceding section. We choose r* = k 2 /h 2 = 1. Then u ^ drops 
out and we have 

(6) u iJ+1 = Ui-u 4- u i+l j - (Fig. 469b). 

It can be shown that for 0 < r* ^ 1 the present explicit method is stable, so that from 
(6) we may expect reasonable results for initial data that have no discontinuities. (For a 
hyperbolic PDE the latter would propagate into the solution domain — a phenomenon that 
would be difficult to deal with on our present grid. For unconditionally stable implicit 
methods see [El] in App. 1.) 

Equation (6) still involves 3 time steps j — 1 JJ + 1, whereas the formulas in the 
parabolic case involved only 2 time steps. Furthermore, we now have 2 initial conditions. 


X 

I 

Time row j+ 1 

1* 


x—r— x— r"X 

h | h 

Time row j 

h 


X 

Time row j - 1 


(a) Formula (5) (b) Formula (6) 

Fig. 469. Mesh points used in (5) and (6) 
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EXAMPLE 1 


So we ask how we get started and how we can use the initial condition (3). This can be 
done as follows. 

From u t ( a\ 0) = g(x) we derive the difference formula 
1 

(7) — («ii “ Mi _i) = gi , hence m* = u ix - 2kg { 

where g f = g(ih). For f = 0, that is, j = 0, equation (6) is 

Mil = Mi-1.0 + Mi + i,o “ «*.-!• 

Into this we substitute as given in (7). We obtain u n = + M i+1 , 0 - «ii + 2kg { 

and by simplification 

(8) u n = 2 (m^— i,o + M i+lj0 ) + /%. 

This expresses in terms of the initial data. It is for the beginning only. Then use (6). 

Vibrating String, Wave Equation 

Apply the present method with h = k = 0.2 to the problem ( I)-(4), where 

fix) = sin 77. v, g(.v) = 0. 

Solution . The grid is the same as in Fig. 467. Sec. 21.6. except for the values of /, which now are 0.2, 
0.4, • • • (instead of 0.04. 0.08, • • The initial values /v 00 . "iq. * * * are the same as in Example 1. Sec. 21.6. 
From (8) and £(a) = 0 we have 


“il = 3("i-1.0 + «i+l.o)- 


From this we compute, using k 10 = */ 40 = sin 0.2 tt = 0.587 785, u 2 0 = //30 = 0.951 057. 

(/=!) m u = |(z/oo + // 20 ) = \ * 0.951 057 = 0.475 528 

(/ = 2) u 21 = ^(// 10 + // 30 ) = i ‘ 1.538 842 = 0.769 421 

and m 31 = u 2 i* ll 4 i = 11 il by symmetry as in Sec. 21.6, Example 1. From (6) with j = I we now compute, 
using m 01 = «02 = * * • = 0. 

(/ = 1) zz 12 = i# 0 i + "21 “ »io = 0- 7< 59 421 - 0.587 785 = 0.181 636 

(/ = 2) «22 = u n + u 31 - u 20 = 0.475 528 -f 0.769 421 - 0.951 057 = 0.293 892, 

and u 32 = « 22 , "42 = u 12 by symmetry; and so on. We thus obtain the following values of the displacement 
t/( a\ r) of the string over the first half-cycle: 


t 

-V = 0 

>. 

II 

o 

io 

x = 0.4 


x = 0.8 

* = 1 

0.0 

0 

0.588 

0.951 

0.951 

0.588 

0 

0.2 

0 

0.476 

0.769 

0.769 

0.476 

0 

0.4 

0 

0.182 

0.294 

0.294 

0.182 

0 

0.6 

0 

-0.182 

-0.294 

-0.294 

-0.182 

0 

0.8 

0 

-0.476 

-0.769 

-0.769 

-0.476 

0 


0 

-0.588 

-0.951 

-0.951 

-0.588 

0 
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These values are exact to 3D (3 decimals), the exact solution of the problem being (see Sec, 12.3) 


ff( .V, t) = Sin 7T-V COS 7TI. 

The reason for the exactness follows from d’Alembert's solution (4). Sec. 12.4. (See Prob. 4, below.) M 

This is the end of Chap. 21 on numerics for ODEs and PDEs, a rapidly developing field 
of basic applications and interesting research, in which large-scale and complicated 
practical problems can now be attacked and solved by the computer. 



VIBRATING STRING 

Solve ( I )-(4) by the present method with h = k = 0.2 for 
the given initial deflection f(x) and initial velocity 0 on the 
given /-interval. 

1. /(.v) = 0.01 a*(1 - a), 0 ^ / ^ 2 

2. /(.v) = .v 1 2 3 4 5 6 7 8 9 ( 1 - .v). 0 ^ / ^ 1 

3. /(.v) = .v if 0 a ^ 0.2, f{x) = 0.25(1 - a) if 

0.2 < a 1 

4. Show that from d'Alembert’s solution (13) in Sec. 1 2.4 
with c = 1 it follows that (6) in the present section 
gives the exact value u i j+l = //(/7z, (y + l)/i). 

5. Tf the string governed by the wave equation (1) starts 
from its equilibrium position with initial velocity 
g(.v) = sin 77.v, what is its displacement at time / = 0.4 
and a* = 0.2, 0.4. 0.6. 0.8? (Use the present method 
with h = 0.2, k = 0.2. Use (8). Compare with the exact 
values obtained from (12) in Sec. 12.4.) 


6. Compute approximate values in Prob. 5, using a finer 
grid (/? = 0.1, k — 0.1), and notice the increase in 
accuracy. 

7. Illustrate the starting procedure when both f and g 
are not identically zero, say, /(a) = I - cos 2tta, 
5 (a) = a — a 2 . Choose h = k = 0. 1 and do 2 time steps. 

8. Show that (12) in Sec. 12.4 gives as another starting 
formula 



(where one can evaluate the integral numerically if 
necessary). In what case is this identical with (8)? 

9. Compute u in Prob. 7 for / = 0.1 and x = 0.1, 

0.2, • • • , 0.9, using the formula in Prob. 8, and 
compare the values. 

10. Solve (1M3) (// = k = 0.2, 5 time steps) subject to 
fix) = a 2 , g( A) = 2a, u x { 0 , t) = 2/, «(1, 0 = (1+ tf. 




1. Explain the Euler and Improved Euler methods in 
geometrical terms, 

2. What are the local and global orders of a method? Give 
examples. 

3. What do you know about error estimates? Why are they 
important? 

4. How did we obtain numeric methods by using the 
Taylor series? 

5. In each Runge-Kutta step we computed auxiliary 
values. How many? Why? 

6. What are one-step and multistep methods? Give 
examples. 

7. What is the idea of a predictor-corrector method? 
Mention some of these methods. 

8. What is the idea of the Rungc— Kutta— Fehlberg method? 

9. How can Runge-Kutta be generalized to systems of 
ODEs? 


TIONS AND PROBLEMS 


10. What is automatic step size control? How is it done in 
practice? 

11. Why and how did we use finite differences in this 
chapter? 

12. Make a list of types of PDEs, corresponding problems, 
and methods for their numeric solution. 

13. How did we approximate the Laplace equation? The 
Poisson equation? 

14. Will a difference equation give exact solutions of a PDE? 

15. How did we handle (a) irregularly shaped domains, 
(b) given normal derivatives at the boundary? 

16. Solve y f = 2a v, y(0) = I, by the Euler method with 
h = 0,1, 10 steps. Compute the error. 

17. Solve y* = l -t- y 2 , v(0) = 0, by the improved Euler 
method with h = 0.1, 5 steps. Compute the enror. 

18. Solve / = (a + .v - 4 ) 2 , >-(0) = 4. by RK with 
h = 0.2, 7 steps. 
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19. Solve Prob. 17 by RK with h = 0.1, 5 steps. Compute 
the error. Compare with Prob. 1 7. 

20. (Fair comparison) Solve y - 2x~ l Vy — inx 4* x“\ 
y(l) = 0 for 1 = a* = 1.8 (a) by tlie Euler method with 
h = 0.1, (b) by the improved Euler method with 
h = 0.2. (c) by RK with h = 0.4. Verify that the exact 
solution is y = (In a*) 2 + In x. Compute and compare 
the errors. Why is the comparison fair? 

21. Compute e x for jt = 0. 0.1, • • • . 1.0 by applying RK 
to / = y 9 y(0) = I. h = 0.1. Show that the result is 
5D-exact. 

22. Solve v' = (a* + v) 2 . y(0) = 0 by RK with h = 0.2. 
5 steps. 

23. Show that by applying the method in Sec. 21.2 to a 
polynomial of first degree we obtain the multistep 
predictor and corrector formulas 

v «+ 1 = y„ + y (3 /„ - / u _i) 

)’n I 1 =J„ + | (/«* 1 + fn) 


where /* , , = f(x n+l , 

24. Apply the multistep method in Prob. 23 to the initial 
value problem y' = a* + y. y(0) = 0, choosing h = 0.2 
and doing 5 steps. Compare with the exact values. 

25. Solve >•' = (y - .v - l) 2 + 2. y(0) = 1 for 0 £ .v £ 1 
by Adams-Moulton with h = 0.1 and starting values 1, 
1.200334589, 1.402709878, 1.609336039. 

26. Solve y" + y = 0. y(0) = 0, y'(0) = 1 by RKN with 
h = 0.2, 5 steps. Find the error. 

27. Solve y[ = -4y r + 3y 2 . y 2 = 5y, - 6y 2 , y x (0) = 3, 
y 2 (0) = -5, by RK for systems, h = 0.1, 5 steps. 

28. Solve y[ = — 5y x + 3y 2 , y 2 = —3 y x — 5y 2 , Vj(0) = 2. 
y 2 (0) = 2 by RK for systems, h = 0.1, 5 steps. 

29. Find rough approximate values of the electrostatic 
potential at P lv P 12 , P 13 in Fig. 470 that lie in a field 
between conducting plates (in Fig. 470 appearing as 
sides of a rectangle) kept at potentials 0 and 1 1 0 volts 
as shown. (Use the indicated grid.) 



0 


Fig. 470. Problem 29 


30-32 


POTENTIAL 


Find the potential in Fig. 47 1 , using the given grid and the 
boundary values: 

30. u = 70 on the upper and left sides, u = 0 on the lower 
and right sides 

31. u(P 10 ) = «(P 30 ) = 960, u(P 2 0 ) = -480, u = 0 
elsewhere on the boundary 


32. u(Po\) = u(P o:i ) = i/(P 4 i) = «(F 43 ) = 200, 

u(P\o) = u(P 3 o) = -400, u(P 2 0 ) = 1600, 

w(F, 02 ) = “(^ 42 ) = «(F 14 ) = u(P 24 ) = 

^(^34) = 0 



| P]3 c 

P * « 



, p >* < 

P 22 . 

P Z2 , 


V 

, p n , 

of 

P 3 1 , 






► P p 
10 20 30 


Fig. 471. Problems 30-32 


33. Verify (13) in Sec. 21.4 for the system (12) and show 
that A in ( 1 2) is nonsingular. 

34. Derive the difference approximation of the heat equation. 

35. Solve the heat equation (1), Sec. 21.6, for the initial 
condition f(x) = .v if 0 ^ a ^ 0.2. f(x) = 0.25(1 - ,v) 
if 0.2 < .v ^ I and boundary condition (3), Sec. 21.6, 
by the explicit method [formula (5) in Sec. 21.6] with 
h = 0.2 and A: = 0.01 so that you get values of the 
temperature at time / = 0.05 as the answer. 

36. A laterally insulated homogeneous bar with ends at 
a* = 0 and a: = 1 has initial temperature 0. Its left end 
is kept at 0, whereas the temperature at the right end 
varies sinusoidally according to 

u(t, 1) = g(t) = sin ^irt. 

Find the temperature w(a\ /) in the bar [solution of (I) 
in Sec. 21.6] by the explicit method with h = 0.2 and 
r = 0.5 (one period, that is, 0 ^ t ^ 0.24). 

37. Find u(x, 0.12) and m(.y, 0.24) in Prob. 36 if the left end 
of the bar is kept at — g[t) (instead of 0), all the other 
data being as before. 

38. Find out how tlie results of Prob. 36 can be used for 
obtaining the results in Prob. 37. Use the values 0.054, 
0.172, 0.325, 0.406 (/ = 0.12, x = 0.2, 0.4, 0.6, 0.8) and 
-0.009, -0.086, -0.252, -0.353 (/ = 0.24) from the 
answer to Prob. 36 to check your answer to Prob. 37. 

39. Solve u t = u xx (0 = a* = 1, / = 0), 

h(a*, 0) = ,y 2 ( I - .v). w( 0. /) = i/(l, /) = 0 by 
Crank-Nicolson with h = 0.2, k = 0.04. 5 time steps. 

40. Find the solution of die vibrating string problem u tt = 
w(a', 0) = a:(1 - a), u t = 0, m(0, /) = m(I, /) = 0 by the 
method in Sec. 21.7 with h = 0.1 and k - 0. 1 for t = 0.3. 





In this chapter we discussed numerics for ODEs (Secs. 21.1-21.3) and PDEs 
(Secs. 21.4—21.7). Methods for initial value problems 


(1) /=/(*,>•), y(x 0 ) = y Q 

involving a first-order ODE are obtained by truncating the Taylor series 


y(x + h) = y(x) + hy\x) + — y\x) + • • • 

where, by (1), y' = /, y" = f' = df/d. x 4- (df/dy)y\ etc. Truncating after the term 
fry', we get the Euler method , in which we compute step by step 


y n +i = >*n + hf{x m y n ) 


(/? = 0 , 1 , • • •)• 


Taking one more term into account, we obtain the improved Euler method. Both 
methods show the basic idea but are too inaccurate in most cases. 

Truncating after the term in /i 4 , we get the important classical Runge-Kutta (RK) 
method of fourth order. The crucial idea in this method is the replacement of the 
cumbersome evaluation of derivatives by the evaluation of f(x, y) at suitable points 
(a\ y); thus in each step we first compute four auxiliary quantities (Sec. 21.1) 

*i = 3’n) 

k 2 = hf(x n + \K y n + ^x) 

(3a) 

k z = hf(x n 4- \lu y n + \k 2 ) 
k 4 = hf(x n + lu y n + k z ) 

and then the new value 

(3b) y n +L = y n + $(k x + 2k 2 4- 2k z 4- /: 4 ). 

Error and step size control are possible by step halving or by RKF 
(Runge-Kutta-Fehlberg). 

The methods in Sec. 21.1 are one-step methods since they get y n+1 from the 
result y n of a single step. A multistep method (Sec. 21.2) uses the values of 
y n , ;y n _i, • • * of several steps for computing y ?l+1 . hitegrating cubic interpolation 
polynomials gives the Adams-Bashforth predictor (Sec. 21.2) 


.v*+i =y n + ~ 7 h(55f n - 5 9/n_! + 37/ n _ 2 - 9/ n _ 3 ) 
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where fj = f(xj, v,), and an Adams-Moulton corrector (the actual new value) 

(4b) y n+ 1 = y n + ^ /?(9/* +I + 19 /„ - 5/ n _! + / n _ 2 ), 

where /* +1 = f(x n+1 , y* +1 ). Here, to get started, y lt y 2 , .V 3 must be computed by 
the Runge-Kutta method or by some other accurate method. 

Section 19.3 concerned the extension of Euler and RK methods to systems 

y' = f(.v, y), thus yj = a\ y v • , y m ), j = 1, ■ - * , m. 

This includes single rath order ODEs, which are reduced to systems. Second-order 
equations can also be solved by RKN (Runge-Kutta-Nystrom) methods. These are 
particularly advantageous for y" = f(x> y) with f not containing y . 

Numeric methods for PDEs are obtained by replacing partial derivatives by 
difference quotients. This leads to approximating difference equations, for the 
Laplace equation to 

(5) u i+hj + m u+1 + Ui- u + «ij- 1 - 4 Uij = 0 (Sec. 21.4) 
for the heat equation to 

(6) J x - Uij) = -^2 (Ui+xj - 2 Uij + Ui-u) (Sec. 21.6) 
and for the wave equation to 

(7) ^2 (fyj+i 2ujj + Ujj—i) — *^2 (Hi+i,j ~~ 2itjj ”r Wj_jj) (Sec. 21.7), 

here h and k are the mesh sizes of a grid in the jc- and v-directions, respectively, 
where in (6) and (7) the variable y is time /. 

These PDEs are elliptic, parabolic , and hyperbolic , respectively. Corresponding 
numeric methods differ, for the following reason. For elliptic PDEs we have 
boundary value problems, and we discussed for them the Gauss-Seidel method 
(also known as Liebmann’s method) and the ADI method (Secs. 21.4, 21.5). For 
parabolic PDEs we are given one initial condition and boundary conditions, and we 
discussed an explicit method and the Crank-Nicolson method (Sec. 21.6). For 
hyperbolic PDEs, the problems are similar but we are given a second initial condition 
(Sec. 21.7). 





PART 



Optimization, 

Graphs 


CHAPTER 22 Unconstrained Optimization* Linear Programming 

CHAPTER 23 Graphs* Combinatorial Optimization 

Ideas of optimization and application of graphs play an increasing role in engineering, 
computer science, systems theory, economics, and other areas. In the first chapter of tills 
part we explain some basic concepts, methods, and results in unconstrained and constrained 
optimization. The second chapter is devoted to graphs and the corresponding so-called 
combinatorial optimization, a relatively new interesting area of ongoing applied and 
theoretical research. 
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CHAPTER 2 2 

Unconstrained Optimization. 
Linear Programming 


Optimization principles are of basic importance in modem engineering design and systems 
operation in various areas. The recent development has been influenced by computers 
capable of solving large-scale problems and by the creation of corresponding new 
optimization techniques, so that the entire field has become a large area of its own. 

In the present chapter we give an introduction to the more important concepts, methods, 
and results on unconstrained optimization (the so-called gradient method) and constrained 
optimization (linear programming). 

Prerequisite: a modest working knowledge of linear systems of equations 

References and Answers to Problems: App. 1 Part F, App. 2. 


22.1 Basic Concepts. 

Unconstrained Optimization 

In an optimization problem the objective is to optimize (maximize or minimize) some 
function /. This function / is called the objective function. 

For example, an objective function f to be maximized may be the revenue in a production 
of TV sets, the yield per minute in a chemical process, the mileage per gallon of a certain 
type of car, the hourly number of customers served in a bank, the hardness of steel, or 
the tensile strength of a rope. 

Similarly, we may want to minimize f if / is the cost per unit of producing certain 
cameras, the operating cost of some power plant, the daily loss of heat in a heating system, 
the idling time of some lathe, or the time needed to produce a fender. 

In most optimization problems the objective function f depends on several variables 

a'i • • • , x*. 

These are called control variables because we can “control” them, that is, choose their values. 

For example, the yield of a chemical process may depend on pressure x\ and temperature 
.v 2 . The efficiency of a certain air-conditioning system may depend on temperature x x , air 
pressure a* 2 , moisture content a 3 , cross-sectional area of outlet jc 4 , and so on. 

Optimization theory develops methods for optimal choices of x l9 • • • , x n> which 
maximize (or minimize) the objective function /, that is, methods for finding optimal 
values of x x , * • • , x n . 
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In many problems the choice of values of a* 1; - • • , jc n is not entirely free but is subject 
to some constraints, that is, additional restrictions arising from the nature of the problem 
and the variables. 

For example, if x x is production cost, then x x ^ 0, and there are many other variables 
(time, weight, distance traveled by a salesman, etc.) that can take nonnegative values only. 
Constraints can also have the form of equations (instead of inequalities). 

We first consider unconstrained optimization in the case of a function fix l9 • • • , A* n ). 
We also write x = (x l9 • • • , x n ) and fix), for convenience. 

By definition, f has a minimum at a point x = X 0 in a region R (where f is defined) 
if 

f(x) ^ /(X 0 ) 

for all x in R. Similarly, f has a maximum at X 0 in R if 

fix) ^ /(X 0 ) 

for all x in R. Minima and maxima together are called extrema. 

Furthermore, f is said to have a local minimum at X 0 if 

fix) ^ fiX 0 ) 

for all x in a neighborhood of X 0 , say, for all x satisfying 

|x - X 0 | = t(.Vi - x,f + ••• + (*„- X n f] m < r, 

where X 0 = (X l9 • • ■ , X n ) and r > 0 is sufficiently small. 

Similarly, f has a local maximum at X 0 if fix) ^ /(X 0 ) for all x satisfying 
|x - X 0 | < r. 

If / is differentiable and has an extremum at a point X 0 in the interior of a region R 
(that is. not on the boundary), then the partial derivatives dffd uc lt • • * , dfid x n must be 
zero at X 0 . These are the components of a vector that is called the gradient of / and 
denoted by grad f or V/. (For n = 3 this agrees with Sec. 9.7.) Thus 

(1) V/(X 0 ) = 0. 


A point X 0 at which (1) holds is called a stationary point of /. 

Condition (1) is necessary for an extremum of f at X 0 in the interior of R , but is not 
sufficient. Indeed, if n = 1, then for y = /(a*), condition (1) isv' = f'iX 0 ) = 0; and, for 
instance, y = a * 3 satisfies y — 3a 2 = 0 at x* = X 0 = 0 where / has no extremum but a 
point of inflection. Similarly, for fix) = x h x 2 we have V/(0) = 0, and / does not have 
an extremum but has a saddle point at 0. Hence after solving (1), one must still find out 
whether one has obtained an extremum. In the case n = 1 the conditions y'iX 0 ) = 0, 
y"(X 0 ) > 0 guarantee a local minimum at X 0 and the conditions /(X 0 ) = 0, y"iX 0 ) < 0 
a local maximum, as is known from calculus. For n > 1 there exist similar criteria. 
However, in practice even solving (1) will often be difficult. For this reason, one generally 
prefers solution by iteration, that is, by a search process that starts at some point and 
moves stepwise to points at which / is smaller (if a minimum of / is wanted) or larger 
(in the case of a maximum). 
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EXAMPLE 1 


The method of steepest descent or gradient method is of this type. We present it here 
in its standard form. (For refinements see Ref. [E25] listed in App. 1.) 

The idea of this method is to find a minimum of /(x) by repeatedly computing minima 
of a function g(t) of a single variable /, as follows. Suppose that f has a minimum at X 0 
and we start at a point x. Then we look for a minimum of f closest to x along the straight 
line in the direction of — V/(x), which is the direction of steepest descent (= direction of 
maximum decrease) of / at x. That is, we determine the value of t and the corresponding 
point 

(2) z(t) = x - fV/(x) 
at which the function 

(3) g(t) = f(z(t)) 

has a minimum. We take this z(t) as our next approximation to X 0 . 

Method of Steepest Descent 

Determine a minimum of 

(4) m = .V! 2 + 3.y 2 2 , 

starling from x 0 = (6, 3) = 6i + 3j and applying the method of steepest descent. 

Solution . Clearly, inspection shows that /(x) has a minimum at 0. Knowing the solution gives us a better 
feel of how the method works. We obtain V/(x) = 2.^1 + 6.v 2 j and from this 

z </) = x - /V/(x) = (l - 2/ ).v 1 i + (I - 6r)A- 2 j 

g(') = /WO) = (1 - 20 V + 3(1 - 60 2 .V 2 2 . 

We now calculate the derivative 

g'(0 = 2(1 - 20-v, 2 (-2) + 6(1 - 60-v 2 2 (-6), 
set g'(0 = 0, and solve for r. finding 

_ .V| 2 + 9.v 2 2 

' ~ lv, 2 + 54 .v 2 2 ' 

Starting from x 0 = 6i + 3j, we compute the values in Table 22.1. which are shown in Fig. 472. 

Figure 472 suggests that in the case of slimmer ellipses (“a long narrow valley”), convergence would be poor. 
You may confirm this by replacing the coefficient 3 in (4) with a large coefficient. For more sophisticated 
descent and other methods, some of them also applicable to vector functions of vector variables, we refer to the 
references listed in Part F of App. 1 ; see also [E251. ■ 



Fig. 472. Method of steepest descent in Example 1 
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Table 22.1 Method of Steepest Descent, Computations in Example 1 


n 


X 

/ 

1 - 2r 

1 - 6/ 

0 

6.000 

3.000 

0.210 

0.581 

-0.258 

1 

3.484 

-0.774 

0.310 

0.381 

-0.857 

2 

1.327 

0.664 

0.210 

0.581 

-0.258 

3 

0.771 

-0.171 

0.310 

0.381 

-0.857 

4 

0.294 

0.147 

0.210 

0.581 

-0.258 

5 

0.170 

-0.038 

0.310 

0.381 

-0.857 

6 

0.065 

0.032 









1. What happens if you apply the method of steepest 
descent to f(x) = a* 2 4 a 2 2 ? 

2. Verify that in Example l, successive gradients are 
orthogonal. What is the reason? 

1 3-11 1 STEEPEST DESCENT 

Do 3 steepest descent steps when: 

3. fix) = 3a‘ x 2 + 2 a* 2 2 — 12a*! + 16a 2 , x 0 = [1 1] T 

4. f(x) = Ai 2 + 2 a 2 2 - x t - 6a 2 , x 0 = [0 0] T 

5. /(x) = 0.5x t 2 + 0.7 a 2 2 - + 4.2^2 + 1. 

A-o = [- 1 1] T 

6. f(x) = a, 2 + 0.1x 2 2 + 8 aj + a 2 + 22.5. 

x 0 =[2 -1] T 

7. /(x) = 0.2X! 2 + a 2 2 - 0.08aj, a 0 = [4 4] T 

8. /(x) = x 2 — .v 2 2 , x 0 = [2 1 ] T , 5 steps. First guess. 
Then compute. Sketch your path. 


9. fix) - A*i 2 + cjc 2 2 , x 0 = [c l] T . Show that 2 steps 
give [c 1] T times a factor, -4c 2 /(c 2 - l) 2 . What 
can you conclude from this about the speed of 
convergence? 

10. fix) = a*! 2 - a*2% x 0 = [1 1] T . Sketch your path. 

Predict the outcome of further steps. 

11. f(x) = ax r 4- bx 2 . any x 0 . First guess, then compute. 

12. CAS EXPERIMENT. Steepest Descent, (a) Write a 
program for the method. 

(b) Apply your program to fix) = A* a 2 4- 4 a* 2 2 , 
experimenting with respect to speed of convergence 
depending on the choice of x 0 . 

(c) Apply your program to fix) = x* 4- a* 2 4 and to 
fix) = Aj 4 4 a 2 4 Xq = [2 l] T . Graph level curves 
and your path of descent. (Try to include graphing 
directly in your program.) 


22.2 Linear Programming 

Linear programming or linear optimization consists of methods for solving optimization 
problems with constraints, that is, methods for finding a maximum (or a minimum) 
x = [a 1t • • • , a„] of a linear objective function 

z = fix) = a x Xi + a z x 2 + • • • + a n Xn 

satisfying the constraints. The latter are linear inequalities, such as 3x t + 4x 2 = 36, or 
a'i S 0, etc. (examples below). Problems of this kind arise frequently, almost daily, for 
instance, in production, inventory management, bond trading, operation of power plants, 
routing delivery vehicles, airplane scheduling, and so on. Progress in computer technology 
has made it possible to solve programming problems involving hundreds or thousands or 
more variables. Let us explain the setting of a linear programming problem and the idea 
of a “geometric” solution, so that we shall see what Is going on. 
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EXAMPLE 1 


Production Plan 

Energy Savers, Inc., produces heaters of types S and L The wholesale price is $40 per heater for S and $88 for 
L. Two time constraints result from the use of two machines Mi and A/ 2 . On Mi one needs 2 min for an S heater 
and 8 min for an L heater. On M z one needs 5 min for an S heater and 2 min for an L heater. Determine 
production figures x x and .v 2 for S and L respectively (number of heaters produced per hour) so that the hourly 
revenue 

z = fix) = 40.VJ + 88.v 2 

is maximum. 

Solution . Production figures x x and .v 2 must be nonnegative. Hence the objective function (to be maximized) 
and the four constraints are 


<0) 

z = 40.x*! + 88.v 2 

(I) 

2xi + 8.y 2 ^ 60 min time on machine M x 

(2) 

5.V! + 2y 2 Si 60 min time on machine M 2 

(3) 

Xi ^ 0 

(4) 

o 

All 

£ 


Figure 473 shows (0)-(4) as follows. Constancy lines 


z = const 


are marked (0). These are lines of constant revenue. Their slope is -40/88 = —5/Ll. To increase z we must 
move the line upward (parallel to itself), as the arrow shows. Equation (1) with the equality sign is marked 

(1) . It intersects the coordinate axes at x ± = 60/2 = 30 (set .v 2 = 0) and a* 2 = 60/8 = 7.5 (set x x = 0). The 
arrow marks the side on which the points (x^ .v 2 ) lie that satisfy the inequality in (1). Similarly for Eqs. 

(2) -(4). The blue quadrangle thus obtained is called the feasibility region. It is the set of all feasible 
solutions, meaning solutions that satisfy all four constraints. The figure also lists the revenue at O, A , B , C. 
The optimal solution is obtained by moving the line of constant revenue up as much as possible without 
leaving the feasibility region completely. Obviously, this optimum is reached when that line passes through 
j B, the intersection (10, 5) of (1) and (2). We see that the optimal revenue 

z max = 40 * 10 + 88 ■ 5 = $840 

is obtained by producing twice as many S heaters as L heaters. M 



Fig. 473. Linear programming in Example 1 
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EXAMPLE 2 


Note well that the problem in Example 1 or similar optimization problems cannot be 
solved by setting certain partial derivatives equal to zero, because crucial to such problems 
is the region in which the control variables are allowed to vary. 

Furthermore, our “geometric” or graphic method illustrated in Example 1 is confined 
to two variables a* 15 jc 2 . However, most practical problems involve much more than two 
variables, so that we need other methods of solution. 

Normal Form of a Linear Programming Problem 

To prepare for general solution methods, we show that constraints can be written more 
uniformly. Let us explain the idea in terms of (l). 


2a*! + 8a*2 = 60. 


This inequality implies 60 - 2x 1 - 8a* 2 = 0 (and conversely), that is, the quantity 

x 3 = 60 — 2xi — 8a* 2 

is nonnegative. Hence, our original inequality can now be written as an equation 


where 


2a*! H- 8a* 2 “b a*3 — 60, 


A3 ^ 0. 


a' 3 is a nonnegative auxiliary variable introduced for converting inequalities to equations. 
Such a variable is called a slack variable, because it “takes up the slack” or difference 
between the two sides of the inequality. 


Conversion of Inequalities by the Use of Slack Variables 


With the help of two slack variables * 3 , * 4 we can write the linear programming problem in Example 
following form. Maximize 

f — 40*! + 88*2 


subject to the constraints 


2*! + 8*2 + *3 = 60 


in the 


5*1 -I- 2*2 + *4 = 60 

*! ^ 0 (/ = 1, • • • , 4). 

We now have n = 4 variables and m = 2 (linearly independent) equations, so that two of the four variables, 
for example, * t , * 2 , determine the others. Also note that each of the four sides of the quadrangle in Fig. 473 
now has an equation of the form *$ = 0: 


OA: * 2 = 0, 
AB: * 4 = 0, 
BC: *3 = 0, 
CO: *i = 0. 


A vertex of the quadrangle is the intersection of two sides. Hence at a vertex, n - m = 4 - 2 = 2 of the 
variables are zero and the others are nonnegative. Thus at A we have * 2 = 0* a 4 = 0, and so on. ■ 
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Our example suggests that a general linear optimization problem can be brought to the 
following normal form. Maximize 

(5) f = CjA'x + c 2 x 2 + • • * + c n x n 

subject to the constraints 


a n xi + • • • + a ln x n = b x 
^ 21^*1 1" * * * "h a 2n x n b 2 

( 6 ) 

^mlAl ^mn^n 

S 0 (/=!••■■• n) 

with all bj nonnegative. (If a bj < 0, multiply the equation by — 1.) Here x l9 • • • , x n 
include the slack variables (for which the c/s in f are zero). We assume that the equations 
in (6) are linearly independent. Then, if we choose values for n — m of the variables, the 
system uniquely determines the others. Of course, since we must have 

A *1 ^ 0, • • • , x n ^ 0, 


this choice is not entirely free. 

Our problem also includes the minimization of an objective function / since this 
corresponds to maximizing -/ and thus needs no separate consideration. 

An //-tuple (a* x , • • • , x n ) that satisfies all the constraints in (6) is called a feasible point 
or feasible solution. A feasible solution is called an optimal solution if for it the objective 
function / becomes maximum, compared with the values of f at all feasible solutions. 

Finally, by a basic feasible solution we mean a feasible solution for which at least 
n — m of the variables jc lf • • • , x n are zero. For instance, in Example 2 we have n = 4, 
m = 2, and the basic feasible solutions are the four vertices 0> A, B , C in Fig. 473. Here 
B is an optimal solution (the only one in this example). 

The following theorem is fundamental. 


THEOREM 1 


Optimal Solution 

Some optimal solution of a linear programming problem (5), (6) is also a basic 
feasible solution of (5), (6). 


For a proof, see Ref. [F5], Chap. 3 (listed in App. 1). A problem can have many optimal 
solutions and not all of them may be basic feasible solutions; but the theorem guarantees 
that we can find an optimal solution by searching through the basic feasible solutions 


only. This is a great simplification; but since there are 



different ways 


of equating n - m of the n variables to zero, considering all these possibilities, dropping 
those which are not feasible and then searching through the rest would still involve very 
much work, even when n and m are relatively small. Hence a systematic search is needed. 
We shall explain an important method of this type in the next section. 
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1. What is the meaning of the slack variables * 3 , * 4 in 
Example 2 in terms of the problem in Example 1 ? 

2. Can we always expect a unique solution (as is the case 
in Example 1)? 

3. Could we find a profit /(xj, x 2 ) = a x x x + a 2 x 2 whose 
maximum is at an interior point of the quadrangle in 
Fig. 473? (Give a reason for your answer.) 

4. Why are slack variables always nonnegative? How 
many of them do we need? 


12. / = 3xi — 6jc 2 , 4*j +x 2 = 4, 

— *! + 2*2 = 6, *i + 2*2 = 14 

13. / = 2*! + 3*2 » 4*i + 3*2 =12, 

*1 — *2 = “3, *2 = 6, 2*! — 3*2 = 0 

14. Minimize f in Prob. 13. 

15. Minimize / in Prob, 1 1. 


5-10 


REGIONS AND CONSTRAINTS 


Describe and graph the region in the first quadrant of the 
*i* 2 _ plane determined by the inequalities: 

5. *i + 2*2 = 10 

6. — Xi + x 2 = 

0 

*1 *2 — 0 

.V] + *2 S 

5 

* 2 ^ 2 

— 2*1 + A- 2 S 

16 


7. 2.0*! + 6.0* 2 ^ 18.0 
5.0*! + 2.5*2 ^ 20.0 


8. 2*! — * 2 = 6 


4*! H- 5*2 = 40 
*! — 2*2 = —3 


9. *! + * 2 ^ 3 

*1 + *2 ^ 9 

— *1 + *2 = — 3 
“*1 + *2 ^ 3 

10 . *! + *2 ^ 2 
3 *! + 5*2 = 15 
2 *! - *2 = — 2 
— *! + 2*2 = 10 


11-15 


MAXIMIZATION AND MINIMIZATION 


Maximize the given objective function / subject to the 
given constraints. 

11 . / = - 10 *! + 2 * 2 , *, ^ 0 , * 2 = 0 , 

-*1 + *2 = “ I » *i + * 2 = 6, *2 = 5 


16. (Maximum output) Giant Ladders, Inc., wants to 
maximize its daily total output of large step ladders by 
producing x x of them by a process P 1 and x 2 by a 
process P 2 , where Pi requires 2 hours of labor and 4 
machine hours per ladder, and P 2 requires 3 hours of 
labor and 2 machine hours. For this kind of work, 1200 
hours of labor and 1600 hours on the machines are at 
most available per day. Find the optimal x t and * 2 . 

17. (Maximum profit) Universal Electric, Inc., 
manufactures and sells two models of lamps, L x and 
L 2 , the profit being $150 and $100, respectively. The 
process involves two workers W x and W 2 who are 
available for this kind of work 100 and 80 hours per 
month, respectively. W x assembles L x in 20 min and 
L 2 in 30 min. W 2 paints L x in 20 min and L 2 in 10 min. 
Assuming that all lamps made can be sold without 
difficulty, determine production figures that maximize 
the profit. 

18. (Minimum cost) Hardbrick, Inc., has two kilns. Kiln 
I can produce 3000 grey bricks, 2000 red bricks, and 
300 glazed bricks daily. For Kiln II the corresponding 
figures are 2000, 5000, and 1500. Daily operating costs 
of Kilns I and U are $400 and $600, respectively. Find 
the number of days of operation of each kiln so that 
the operation cost in filling an order of 18000 grey, 
34000 red, and 9000 glazed bricks is minimized. 

19. (Maximum profit) United Metal, Inc., produces alloys 
Pi (special brass) and B 2 (yellow tombac). B t contains 
50% copper and 50% zinc. (Ordinary brass contains 
about 65% copper and 35% zinc.) B 2 contains 75% 
copper and 25% zinc. Net profits are $120 per ton of 
B 1 and $100 per ton of S 2 . The daily copper supply is 
45 tons. The daily zinc supply is 30 tons. Maximize 
the net profit of the daily production. 

20. (Nutrition) Foods A and B have 600 and 500 calories, 
contain 15 g and 30 g of protein, and cost $1.80 and 
$2.10 per unit, respectively. Find the minimum cost 
diet of at least 3900 calories containing at least 150 g 
of protein. 
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223 Simplex Method 

From the last section we recall the following. A linear optimization problem (linear 
programming problem) can be written in normal form; that is: 

Maximize 

(1) k. jfOO C]*i d* * ■ * 4* c n x n 

subject to the constraints 

On AT + • * ' + flln*n = b l 
a 21 x\ + • • • + a 2 tt x n = b 2 

( 2 ) 

"F * * * d - a mn x n b vl 
Xi ^ 0 (/=!,••*, n). 

For finding an optimal solution of this problem, we need to consider only the basic feasible 
solutions (defined in Sec. 22.2), but there are still so many that we have to follow a 
systematic search procedure. In 1948 G. B. Dantzig published an iterative method, called 
the simplex method, for that purpose. In this method, one proceeds stepwise from one 
basic feasible solution to another in such a way that the objective function / always 
increases its value. Let us explain this method in terms of the example in the last section. 
In its original form the problem concerned the maximization of the objective function 

z = 40a'! 4- 88*2 

subject to 2*! + 8* 2 = 60 

5* a 4- 2 x 2 = 60 
*! ^ 0 
x 2 = 0. 

Converting the first two inequalities to equations by introducing two slack variables * 3 , 
* 4 , we obtained the normal form of the problem in Example 2. Together with the objective 
function (written as an equation z — 40*! — 88* 2 = 0) this normal form is 

z ~ 40 *! - 88*2 — 0 

(3) 2*i 4- 8*2 4- *3 =60 

5*! 4-2*2 4- * 4 = 60 

where x x ^ 0, • • * , * 4 S 0. This is a linear system of equations. To find an optimal 
solution of it, we may consider its augmented matrix (see Sec. 7.3) 


A*1 *2 *3 *4 b 




nj. 

-40 

-88 

_L o 

0 

-L- 0-1 

(4) 

T 0 = 

i 

o ! 

i 

2 

8 

i 

! i 

i 

0 

! 60 
1 



_ o ! 

5 

2 

! o 

i 

1 

! 60 . 
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This matrix is called a simplex tableau or simplex table (the initial simplex table). These 
are standard names. The dashed lines and the letters 

z, x l9 * * * ? b 

are for ease in further manipulation. 

Every simplex table contains two kinds of variables By basic variables we mean 
those whose columns have only one nonzero entry. Thus a* 3 , a * 4 in (4) are basic variables 
and a* 1? .v 2 are nonbasic variables. 

Every simplex table gives a basic feasible solution. It is obtained by setting the nonbasic 
variables to zero. Thus (4) gives the basic feasible solution 

Aj = 0, a *2 = 0, a*3 = 60/1 — 60, a * 4 = 60/1 = 60, z = 0 

with a * 3 obtained from the second row and jc 4 from the third. 

The optimal solution (its location and value) is now obtained stepwise by pivoting, 
designed to take us to basic feasible solutions with higher and higher values of z until the 
maximum of z is reached. Here, the choice of the pivot equation and pivot are quite 
different from that in the Gauss elimination. The reason is that a* 1s a* 2 , jc 3 , a * 4 are restricted 
to nonnegative values. 

Step 1. Operation O x : Selection of the Column of the Pivot 

Select as the column of the pivot the first column with a negative entry in Row 1. In (4) 
this is Column 2 (because of the -40). 

Operation O z : Selection of the Row of the Pivot. Divide the right sides [60 and 60 in 
(4)] by the corresponding entries of the column just selected (60/2 = 30, 60/5 = 12). 
Take as the pivot equation the equation that gives the smallest quotient. Thus the pivot 
is 5 because 60/5 is smallest. 

Operation 0 3 . Elimination by Row Operations. This gives zeros above and below the 
pivot (as in Gauss-Jordan, Sec. 7.8). 

With the notation for row operations as introduced in Sec. 7.3, the calculations in Step 
1 give from the simplex table T 0 in (4) the following simplex table (augmented matrix), 
with the blue letters referring to the previous table. 



7 

*1 

A'2 


*3 

A* 4 

b 



r i 


-72 

1 

■l 

0 

8 ! 

480 1 

Row 1 4- 8 Row 3 

(5) T x = 

0 

i 

! o 

i 

7.2 

1 

1 

1 

| 

l 

i 

-0.4 ! 

i 

36 

Row 2 - 0.4 Row 3 


. 0 

! 5 

2 

1 

1 

0 

i ! 

60 . 



We see that basic variables are now a* 1? a * 3 and nonbasic variables are x 2 , a 4 . Setting the 
latter to zero, we obtain the basic feasible solution given by T 1? 

. Yl = 60/5 = 12 , a * 2 = 0, a * 3 = 36/1 = 36, ,v 4 = 0, z = 480. 

This is A in Fig. 473 (Sec. 22 . 2 ). We thus have moved from O: (0. 0) with z — 0 to A: 
(12, 0) with the greater z = 480. The reason for this increase is our elimination of a term 
(—40 a*!) with a negative coefficient. Hence elimination is applied only to negative entries 
in Row 1 but to no others. This motivates the selection of the column of the pivot. 
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We now motivate the selection of the row of the pivot. Had we taken the second row 
of T 0 instead (thus 2 as the pivot), we would have obtained z = 1200 (verify!), but this 
line of constant revenue z — 1200 lies entirely outside the feasibility region in Fig. 473. 
This motivates our cautious choice of the entry 5 as our pivot because it gave the smallest 
quotient (60/5 = 12). 

Step 2. The basic feasible solution given by (5) is not yet optimal because of the negative 
entry -72 in Row 1. Accordingly, we perform the operations O x to 0 3 again, choosing 
a pivot in the column of —72. 

Operation O t . Select Column 3 of T x in (5) as the column of the pivot (because -72 < 0). 
Operation 0 2 • We have 36/7.2 = 5 and 60/2 = 30. Select 7.2 as the pivot (because 5 < 30). 
Operation O z . Elimination by row operations gives 


z 

X 1 

*2 

x 3 

A'4 

b 



rij. 

0 

0 ! 

10 

4 ! 

840 1 

Row 1 + 

10 Row 2 

i 

o ! 

0 

i 

7.2 | 

1 

-0.4 i 

36 



o ! 

5 

0 ! 

1 

1 \ 

50 

Row 3 - 

2 

• Row 2 



i 

~ 3^6 

09 ! 



7.2 


We see that now .v 1? x 2 are basic and x 3 , a * 4 nonbasic. Setting the latter to zero, we obtain 
from T 2 the basic feasible solution 

x x = 50/5 = 10, a' 2 = 36/7.2 = 5, a 3 = 0, a 4 = 0, z = 840. 

This is B in Fig. 473 (Sec. 22.2). In this step, z has increased from 480 to 840, due to the 
elimination of —72 in T v Since T 2 contains no more negative entries in Row 1, we 
conclude that z = f( 10, 5) = 40 • 10 4- 88 • 5 = 840 is the maximum possible revenue. 
It is obtained if we produce twice as many S heaters as L heaters. This is the solution of 
our problem by the simplex method of linear programming. ■ 

Minimization. If we want to minimize z = /(x) (instead of maximize), we take as the 
columns of the pivots those whose entry in Row 1 is positive (instead of negative). In 
such a Column k we consider only positive entries tj k and take as pivot a tj k for which 
bj/tj k is smallest (as before). For examples, see the problem set. 



SIMPL£X METHOD 

Write in normal form and solve by the simplex method, 
assuming all x j to be nonnegative. 

1. Maximize / = 3 a*j + 2a* 2 subject to 3 a* x + 4.v 2 ^ 60, 
4 .Yj + 3.y 2 = 60, lO.Yj + 2a 2 = 120. 

2. Prob. 16 in Problem Set 22.2. 

3. Maximize the profit in the daily production of x x metal 
frames F x ($90 profit/frame) and a 2 frames F 2 ($50 
profit/frame) under the restrictions x x + 3a 2 = 1800 
(material), x x + x 2 ^ 1000 (machine hours), 
3 .v 2 + x 2 m 2400 (labor). 


4. Maximize f = 2x x + 3.y 2 + .v 2 subject to 

Xi + a 2 + a* 3 = 4 . 8 , 10a-! + a*3 = 9 . 9 , a* 2 — a* 3 = 0 . 2 . 

5. The problem in the text with the order of the constraints 
interchanged. 

6. Minimize / = 4*! — 1 0 a 2 - 20a 3 subject to 
3 a*! -1- 4a 2 -I- 5a 3 ^ 60, 2a*! 4* a 2 = 20, 

2a-! + 3x 3 ^ 30. 

7. Minimize f = 5x 1 — 20a 2 subject to —2 a*! + 10a* 2 ^ 5, 
2x x + 5 a * 2 ^ 10 . 

8. Prob. 20 in Problem Set 22.2. 
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(b) Write a program for maximizing z = 4 a 2 x 2 

in R. 

(c) Write a program for maximizing 

z = 4- • • • 4 a n x n subject to linear constraints. 

(d) Apply your programs to problems in this problem 
set and the previous one. 

Difficulties 

We recall from the last section that in the simplex method we proceed stepwise from one 
basic feasible solution to another, thereby increasing the value of the objective function 
f until we reach an optimal solution. Occasionally (but rather infrequently in practice), 
two kinds of difficulties may occur. 

The first of these is degeneracy. A degenerate feasible solution is a feasible solution 
at which more than the usual number n — m of variables are zero. Here n is the number 
of variables (slack and others) and m the number of constraints (not counting the Xj ^ 0 
conditions). In the last section, n = 4 and m = 2, and the occurring basic feasible solutions 
were nondegenerate; n — rn = 2 variables were zero in each such solution. 

In the case of a degenerate feasible solution we do an extra elimination step in which 
a basic variable that is zero for that solution becomes nonbasic (and a nonbasic variable 
becomes basic instead). We explain this in a typical case. For more complicated cases 
and techniques (rarely needed in practice) see Ref. [F5J in App. I . 

EXAMPLE 1 Simplex Method, Degenerate Feasible Solution 

AB Steel, Inc., produces two kinds of iron I x , I 2 by using three kinds of raw material R lt R 2 . R 2 (scrap iron and 
two kinds of ore) as shown. Maximize the daily profit. 


9, Maximize / = 34.Vj 4 29a 2 4 32.v 3 subject to 
8*! 4 2a 2 4 .v 3 ^ 54, 3a‘j 4 8a* 2 4 2.v 3 ^ 59, 
Ai 4 a ' 2 4 5a 3 ^ 39. 

10. CAS PROJECT. Simplex Method, (a) Write a 
program for graphing a region R in the first quadrant 
of the AjAVplane determined by linear constraints. 


22.4 Simplex Method: 


Raw 

Material 

Raw Material Needed 
per Ton 

Raw Material Available 
per Day (tons) 

Iron / x 

Iron / 2 

R i 

2 

1 

16 

R 2 

1 

1 

8 

R 3 

0 

1 

3.5 

Net profit 
per ton 

$150 

$300 



Solution . Let a*! and .v 2 denote the amount (in tons) of iron / 2 and / 2 . respectively, produced per day. Then 
our problem is as follows. Maximize 

<U c = fix) = 150a*! 4 300 a* 2 

subject to the constraints .v x ^ 0. x 2 ^ 0 and 

2a*! 4 a 2 ^ 16 (raw material R t ) 

a*! + .v 2 = 8 (raw material R 2 ) 

a* 2 ^ 3.5 (raw material /? 3 ). 
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By introducing slack variables .v 3 , *4, .v 5 we obtain the normal form of the constraints 


( 2 ) 


2a‘! 4- x 2 + .V 3 = 16 

x x + x 2 + -v 4 = 8 

x 2 + .v 5 = 3.5 

a* HO (1 = I . * * • . 5). 


As in die last section we obtain from (1) and (2) the initial simplex table 



*1 

*2 

• v 3 

A '4 

x 5 


b 

r-M- 


r 30p__j. 

_0_. 

0 

__0_ 

1 

■r 

_0_" 

0 ! 

2 

1 ! 

I 

1 

0 

0 

l 

l 

| 

16 

0 ! 

1 

1 

1 j 

0 

1 

0 

1 

1 

l 

8 

. 0 ! 

0 

1 ! 

0 

0 

1 

l 

1 

3.5. 


We see that .v x . x 2 are nonbasic variables and .v 3 . .v 4 , a* 5 are basic. With *1 — a 2 = 0 we have from (3) the basic 
feasible solution 

a*i = 0. A- 2 = 0. a 3 = 16/1 = 16. .v 4 = 8/1 = 8. .v 5 = 3.5/1 = 3.5. c = 0. 

This is O: (0. 0) in Fig. 474. We have ;/ = 5 variables Xj . m = 3 constraints, and n - m = 2 variables equal 
to zero in our solution, which thus is nondegenerate. 

Step 1 of Pivoting 

Operation O x : Column Selection of Pivot. Column 2 (since -150 < 0). 

Operation 0 2 : Row Selection of Pivot. 16/2 = 8. 8/1 = 8: 3.5/0 is not possible. Hence we could choose 
Row 2 or Row 3. We choose Row 2. The pivot is 2. 

Operation O z : Elimination by Row Operations. This gives the simplex table 


A'l A*2 A3 A‘4 .V5 b 


“ 1 

0 

-225 

1 

1 

75 

0 

““o" 

1 

1 

1200 ’ 

Row 1 75 Row 2 

0 

2 

1 

1 

1 

1 

1 

0 

0 

1 

1 

1 

16 


0 

0 

1 

5 

1 

1 

1 

“5 

1 

0 

1 

1 

1 

0 

Row 3 — | Row 2 

. 0 

0 

1 

1 

1 

0 

0 

1 

1 

1 

3.5 _ 

Row 4 


We see that the basic variables are .v 1? .v 4 . .v 5 and the nonbasic are .v 2 . .v 3 . Setting the nonbasic variables to zero, 
we obtain from the basic feasible solution 



Fig. 474. Example 1, where A is degenerate 
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. Vj = 16/2 = 8, .v 2 = 0. .v 3 = 0, .v 4 = 0/1 = 0, .v 5 = 3.5/1 = 3.5, z = 1200 

This is A: (8. 0) in Fig. 474. This solution in degenerate because .v 4 = 0 (in addition to .v 2 = 0, .v 3 = 0); 
geometrically: the straight line .V4 = 0 also passes through A. This requires the next step, in which .v 4 will become 
nonbasic. 

Step 2 of Pivoting 

Operation Oji Column Selection of Pivot. Column 3 (since —225 < 0). 

Operation 0 2 i Row Selection of Pivot. 16/1 = 16, 0/| = 0. Hence 3 must serve as the pivot. 

Operation 0 3 : Elimination by Row Operations. This gives the following simplex table. 



* 


•Vl 

• v 2 

«V 3 


- v 5 

b 



r 1 

1 

0 

0 1 

-150 

450 

0 

l 1200 1 

Row 1 4- 450 Row 3 



“ !~ 


1- 




_i 



0 

1 

| 

2 

0 I 

2 

-2 

0 

I 16 

Row 2-2 Row 3 

T 2 = 

0 

1 

1 

1 

0 

1 

I! 

1 

“2 

l 

0 

1 

f 0 



_0 

1 

1 

0 

0 I 

1 

-2 

1 

! 3 . 5 . 

Row 4-2 Row 3 


We see that the basic variables are jr lt .v 2 , ,v 5 and the nonbasic are .v 3 , .v 4 . Hence .v 4 has become nonbasic, as 
intended. By equating the nonbasic variables to zero we obtain from T 2 the basic feasible solution 

.vj = 16/2 = 8, .v 2 = 0/§ = 0, .y 3 = 0, .v 4 = 0, .v 5 = 3.5/1 = 3.5. z = 1200. 

This is still A: (8. 0) in Fig. 474 and z has not increased. But this opens the way to the maximum, which we 
reach in the next step. 

Step 3 of Pivoting 

Operation Op Column Selection of Pivot. Column 4 (since -150 < 0). 

Operation 0 2 : Row Selection of Pivot. 16/2 • 8, 0/(— 3) = 0. 3.5/1 = 3.5. We can take 1 as the pivot. (With 
as the pivot we would not leave A. Try it.) 

Operation 0 3 : Elimination by Row Operations. This gives the simplex table 


z x\ 

• v 2 

- v 3 

.V 4 

*5 

b 



p~!-2- 

_0_ 

4 -°- 

J50 

150 

l 1725 1 

-r 

Row 1 4 

150 Row 4 

0 

2 

0 

i 0 

2 

-2 

! 9 

Row 2 - 

1 Row 4 

0 

0 

1 

2 

! 0 

0 

1 

2 

j 1.75 

Row 3 4 

| Row 2 

.0 

0 

0 

i 1 

—2 

1 

j 3.5 . 




We see that basic variables are x\, .v 2 . ,v 3 and nonbasic .v 4 . .v 5 . Equating the latter to zero we obtain from T 3 the 
basic feasible solution 

A'j = 9/2 = 4.5, a * 2 = 1.75/| = 3.5, .v 3 = 3.5/1 = 3.5, .v 4 = 0, .v 5 = 0, z= 1725. 

This is B : (4.5, 3.5) in Fig. 474. Since Row l of T 3 has no negative entries, we have reached the maximum 

daily profit z max = /(4.5, 3.5) = 150 • 4.5 4 300 • 3.5 = $1725. This is obtained by using 4.5 tons of iron /| 

and 3.5 tons of iron / 2 . B 

Difficulties in Starting 

As a second kind of difficulty, it may sometimes be hard to find a basic feasible solution 
to start from. In such a case the idea of an artificial variable (or several such variables) 
is helpful. We explain this method in terms of a typical example. 
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EXAMPLE 2 


Simplex Method: Difficult Start, Artificial Variable 

Maximize 

(7) z = /(x) = 2 a*! + a * 2 

subject to the constraints a-j ^ 0, .v 2 ^ 0 and (Fig. 475) 

a*i ~ iv 2 ^ I 
a*i - * v 2 = 2 

x x + .V 2 ^ 4. 

Solution . By means of slack variables we achieve the normal form of the constraints 


2*1 - *2 

= 0 

• V 1 ~ i v 2 

- -V 3 = I 

.Vi ~ * 2 

4- a*4 = 2 

X\ + *2 

-3- 

II 

£ 

+ 

* ^ 0 

(/= 5). 


Note that the first slack variable is negative (or zero), which makes .v 3 nonnegative within the feasibility region 
(and negative outside). From (7) and (8) we obtain the simplex table 


- 

•Vi 

x 2 

*3 

a- 4 

A 5 


b 

r i 

1 -2 

-1 1 

0 

0 

0 

1 

o 1 


~\ 

1 - 




-t- 


0 

! i 

i 

-i ! 
| 

-I 

0 

0 

1 

1 

1 

1 

0 

i 

1 

-1 ] 

0 

l 

0 

1 

1 

1 

2 

_ 0 

! i 

i ! 

0 

0 

1 

1 

1 

4 . 


A j. a * 2 are nonbasic. and we would like to take a 3 . .v 4 . a * 5 as basic variables. By our usual process of equating 
the nonbasic variables to zero we obtain from this table 

a*i = 0, .v 2 « 0, .v 3 = l/(- 1) - -l t A - 4 = 2/1 = 2. .y 5 = 4/1=4, - = 0. 

,v 3 < 0 indicates that (0. 0) lies outside the feasibility region. Since .v 3 < 0, we cannot proceed immediately. 
Now, instead of searching for other basic variables, we use the following idea. Solving the second equation in 
(8) tor a 3 , we have 

a- 3 = - 1 + x, - |a- 2 . 

To this we now add a variable .v 6 on the right. 




SEC 22.4 Simplex Method: Difficulties 


951 


(9) .*3 = - 1 + -Vj - |.v 2 + .v 6 . 

x 6 is called an artificial variable and is subject to the constraint .v 6 ^ 0. 

We must take care that .v 6 (which is not part of the given problem!) will disappear eventually. We shall see 
that we can accomplish this by adding a term —Mx 6 with very large M to the objective function. Because of 
(7) and (9) (solved for .v 6 ) this gives the modified objective function for this “extended problem” 

(10) z = z- Mx 6 = 2*i + a ' 2 - Mx 6 = (2 4- A#)*, + (1 - £M)a 2 - Mx 3 - M. 

We see that the simplex table corresponding to (10) and (8) is 



z x x 

.v 2 

• v 3 

.v 4 

•*5 

• v 6 

b 


1 1 -2 - M 
_ L 

-1+1 M | 

M 

0 

0 

0 I 

l_ 

—M 


0 1 

~2 1 

-I 

0 

0 

o ! 

1 

T 0 = 

0 1 
1 

1 

-1 1 
| 

0 

1 

0 

1 

0 1 
1 

2 


0 1 

1 { 

0 

0 

1 

o | 

4 


1 

0 I 1 

-4 ! 

-1 

0 

0 

1 

1 1 

1 


The last row of this table results from (9) written as x x - |a 2 - a * 3 4- .v 6 = 1 . We see that we can now start, 
taking a* 4 . a 5 . a * 6 as the basic variables and .Vj. a 2 . a 3 as the nonbasic variables. Column 2 has a negative first 
entry. We can take the second entry (1 in Row 2) as the pivot. This gives 


- 

-*i 

• v 2 


- v 3 

• v 4 

a 5 

• v 6 


b 

I 1 

r 

_o__ 

—2 

J 

2 _ 

0 

_JL_ 

__o 

1 

* T " 

2_" 

o | 

i 

“2 

1 

1 

-1 

0 

0 

0 

1 

1 

1 

i 

0 1 
1 

0 

4 

1 

1 

| 

1 

I 

0 

0 

1 

1 

1 

1 

0 ! 

0 

3 

2 

1 

1 

1 

0 

i 

0 

1 

1 

3 

i 

L o i 

0 

0 

1 

1 

0 

0 

0 

1 

1 

1 

0 


This corresponds to x\ = 1. .v 2 = 0 (point A in Fig. 475). .v 3 = 0. .v 4 = I. a 5 = 3. .v 6 = 0. We can now drop 
Row 5 and Column 7. In this way we get rid of .v 6 , as wanted, and obtain 


*i v 2 ,v 3 -v 4 -v 5 b 




-2 

l -2 
t 

_0_ 

— °--l- 

2 ~ 

0 

! i 

1 

2 

! -i 

0 

o ! 

1 

0 

i 0 

_1 

2 

i 

1 

o 

1 

. 0 

! o 

1 

! i 

0 

i ! 

3 _ 


In Column 3 we choose 3/2 as the next pivot. We obtain 

z xi x 2 x 3 a - 4 .v 5 b 


fH 

i 0 

0 

i 

— z 

3 

0 


6 “ 

0 

! i 

o 1 

_2 

3 

0 

* ! 

2 

0 

I 0 

0 ! 

4 

3 

1 

3 ! 

2 

. 0 

i 0 

3 ! 
2 1 

1 

0 

i ! 

3 . 


This corresponds to Aj — 2, .v 2 - 2 (this is B in Fig. 475), .v 3 = 0, .v 4 = 2. .v 5 = 0. In Column 4 we choose 4/3 
as the pivot, by the usual principle. This gives 
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* 


-vj 

- v 2 

-<3 

.v 4 

*5 


b 

r , 

1 

0 

0 

1 0 

1 

2 

3 

__ 2 _ 

1 

7 1 

— 

-4- 


— 4 

-4- 

— 

0 

) 

1 

1 

1 

0 

1 0 

1 

2 

1 

2 

1 

1 

3 

0 

1 

1 

1 

0 

° ! 3 

1 

1 

3 

i 

1 

1 

2 

. 0 

1 

1 

0 

l ! o 

3 

4 

2 

4 

1 

1 

3 

2 -1 


This corresponds to x x = 3. .v 2 = 1 (point C in Fi°. 475), a 3 = |, ,v 4 = 0, .v 5 = 0. This is the maximum 
/max=/<3, 0 = 7. ■ 


ERh03EL-£M SITE 


If in a step you have a choice between pivots, take the one 
that comes first in the column considered. 

1. Maximize z = f i(x) = 6xj 4- 12 a 2 subject to 
0 ^ x x = 4, 0 ^ a* 2 ^ 4, 6a*! 4- I2a 2 ^ 72. 

2. Do Prob. 1 with the last two constraints interchanged. 

3. Maximize the daily output in producing a*! glass plates 
by a process P Y and a* 2 glass plates by a process P 2 
subject to the constraints (labor hours, machine hours, 
raw material supply) 

2a*j + 3a* 2 ^ 130, 3a*! 4- Sa* 2 ^ 300, 

4a* x + 2a* 2 S 140. 

4. Maximize z = 300a*! 4- 500.v 2 subject to 

2a*! + 8a* 2 ^ 60, 2a*! + a* 2 ^ 30, 4a*! 4- 4a 2 ^ 60. 

5. Do Prob. 4 with the last two constraints interchanged. 
Comment on the resulting simplification. 

6. Maximize die total output / = x x 4- a 2 4- a* 3 (production 
figures of three different production processes) subject 


to input constraints (limitation of machine time) 

4.\*! 4" 5.v 2 4* 8a* 3 ^ 1 2, 

8a*! 4- 5a* 2 4* 4a*3 =12. 

7. Maximize f = 6a*i 4- 6a* 2 4- 9.v 3 subject to 

Xj = 0 (y = I , • • • , 5), and a*! 4- a* 3 4* a* 4 = l, 

A* 2 4- A*3 4- Ag = 1. 

8. Using an artificial variable, minimize f = 2a*! — a* 2 
subject to .Vi ^ 0, a* 2 = 0, a*i 4- a* 2 = 5, -a*! 4- a* 2 = 1 , 
5a*! 4- 4a* 2 ^ 40. 

9. Maximize / = 4 a*i 4- .v 2 4- 2 a* 3 subject to x x ^ 0, 
A* 2 = 0, .1*3 = 0, A*! 4- A- 2 4- A*3 = 1 . A* X 4- A* 2 — -V 3 ^ 0. 

10. If one uses the method of artificial variables in a 
problem without solution, this nonexistence will 
become apparent by the fact that one cannot get rid of 
the artificial variable. Illustrate this by trying to 
maximize / = 2a* x 4- a* 2 subject to a* x ^ 0, a* 2 = 0, 
2a*! 4“ x 2 — 2, A*! 4- 2a* 2 = 6, A*! 4- A*2 = 4. 




STIONS AND PROBLEMS 


1. What is the difference between constrained and 
unconstrained optimization? 

2. State the idea and the basic formulas of the method of 
steepest descent. 

3. Write down an algorithm for the method of steepest 
descent. 

4. Design a “method of steepest ascent” for determining 
maxima. 

5. What is linear programming? Its basic idea? An 
objective function? 

6. Why can we not use methods of calculus for extrema 
in linear programming? 

7. What are slack variables? Artificial variables? Why did 
we use them? 

8. Apply the method of steepest descent to 


/(x) = a*! 1 2 3 4 5 6 7 8 4- 1.5a* 2 2 , starting from (6, 3). Do 3 steps. 
Why is the convergence faster than in Example 1, 

Sec. 22.1? 

9. What does the method of steepest descent amount to in 
the case of a single variable? 

10. In Prob. 8 start from x 0 = 1 1 .5 1 ] T . Show that the next 

even-numbered approximations are x 2 = A'X 0 , x 4 = k 2 x 0y 
etc., where k = 0.04. 

11. What happens in Example 1 of Sec. 22. 1 if you replace 
the function f(x) = a*, 2 4- 3a* 2 2 by /(x) = a* 2 4- 5a* 2 2 ? 
Do 5 steps, starting from x 0 = [6 3] T . Is the 
convergence faster or slower? 

12. Apply the method of steepest descent to 

/(x) = 9 a*! 2 4- a* 2 2 4- 18a*! — 4a* 2 , 5 steps, starting 
from x 0 = [2 4] t . 

13. In Prob. 1 2, could you start from [0 0] T and do 5 steps? 
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14. Show thal the gradients in Prob. 13 are orthogonal. Give 
a reason. 


1 15-20 Graph or sketch the region in the first quadrant 
of the .Vj.v 2 -plane determined by the following inequalities. 


15. ,V| -I- 3*2 = 6 
2 a*! -1- a* 2 = 4 
17. A*! 4 A'2 = 0 

A'l + A'2 ^ 4 

19. A*! 4 A* 2 S 5 

a-2 ^ 3 

-A*i 4 A* 2 = 2 


16. A*! - 2a 2 ^ -2 

0.8a*! 4 .v 2 = 6 

18. A*i - 2 .v 2 S -4 

2a*! -I- A' 2 ^ 12 

A'l + x 2 = 8 

20. A*i + A* 2 = 2 

2a* j - 3 a*2 ^ -12 


2 1-25 1 Maximize or minimize as indicated. 

21. Maximize / = 10 a*! 4 20a* 2 subject to a*! ^ 5, 
a*! -I- a * 2 = 6. a - 2 = 4. 

22. Maximize f = .vi -I- a* 2 subject to a*i 4- 2a* 2 = 10, 
2a*! 4a* 2 ^ IO.a-2^4. 

23. Minimize / = 2a'x — 10a* 2 subject to — x 2 = 4, 
2a*! 4 -V 2 ^ 14, A*! 4 A*2 = 9, -A*! 4 3a*2 = 15. 


24. A factory produces two kinds of gaskets, G l? G 2 , with 
net profit of $60 and $30, respectively. Maximize the 
total daily profit subject to the constraints (a j = number 
of gaskets Gj produced per day) 

40a*! 4 40a* 2 = 1800 (Machine hours), 


200a*! 4 20at 2 ^ 6300 (Labor). 


25. Maximize the daily output in producing .Vi chairs by 
a process P x and a* 2 chairs by a process P 2 subject to 
3a*! 4 4a* 2 ^ 550 (machine hours), 5a*x 4 4a* 2 ^ 650 
(labor). 
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Unconstrained Optimization. Linear Programming 


In optimization problems we maximize or minimize an objective function z — /(x) 
depending on control variables a* 1t , x m whose domain is either unrestricted 
(“unconstrained optimization,” Sec. 22.1) or restricted by constraints in the form 
of inequalities or equations or both (“constrained optimization,” Sec. 22.2). 

If the objective function is linear and the constraints are linear inequalities in 
a*i, • • • , x m , then by introducing slack variables jr w+1 , • • • , x n we can write the 
optimization problem in normal form with the objective function given by 

(I) /l = CiA'i + • • • + C n X n 

(where c„ l+1 = * • * = c n = 0) and the constraints given by 

«11*1 + 012*2 + • • • + fl ltr V n = b x 


( 2 ) 


0?iil*l “b 0 jh2-*2 “b " * " "b a mn x n 

*i = 0, • • • , x n S 0. 


In this case we can then apply the widely used simplex method (Sec. 22.3), a 
systematic stepwise search through a very much reduced subset of all feasible 
solutions. Section 22.4 shows how to overcome difficulties with this method. 






CHAPTER 2 3 
Graphs. 

Combinatorial Optimization 


Graphs and digraphs (= directed graphs) have developed into powerful tools in areas, 
such as electrical and civil engineering, communication networks, operations research, 
computer science, economics, industrial management, and marketing. An essential factor 
of this growth is the use of computers in large-scale optimization problems that can be 
modeled by graphs and solved by algorithms provided by graph theory. This approach 
yields models of general applicability and economic importance. It lies in the center of 
combinatorial optimization, a term denoting optimization problems that are of 
pronounced discrete or combinatorial structure. 

This chapter gives an introduction to this wide area, which constitutes a shift of emphasis 
away from differential equations, eigenvalues, and so on, and is full of new ideas as well 
as open problems — in connection, for instance, with efficient computer algorithms. The 
classes of problems we shall consider include transportation of minimum cost or time, 
best assignment of workers to jobs, most efficient use of communication networks, and 
many others. Problems for these classes often form the core of larger and more involved 
practical problems. 

Prerequisite: none. 

References and Answers to Problems: App. 1 Part F, App. 2. 


23.1 Graphs and Digraphs 

Roughly, a graph consists of points, called vertices, and lines connecting them, called 
edges. For example, these may be four cities and five highways connecting them, as in 
Fig. 476. Or the points may represent some people, and we connect by an edge those who 
do business with each other. Or the vertices may represent computers in a network and 
the edges connections between them. Let us now give a formal definition. 
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Fig. 476. Graph consisting of 
4 vertices and 5 edges 


Double edge 

Fig. 477. Isolated vertex, loop, double 
edge. (Excluded by definition.) 
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DEFINITION 


Graph 

A graph G consists of two finite sets (sets having finitely many elements), a set V 
of points, called vertices, and a set E of connecting lines, called edges, such dial 
each edge connects two vertices, called the endpoints of the edge. We write 

G = (V, £). 

Excluded are isolated vertices (vertices that are not endpoints of any edge), loops 
(edges whose endpoints coincide), and multiple edges (edges that have both 
endpoints in common. See Fig. 477. 


CAUTION! Our three exclusions are practical and widely accepted, but not uniformly. 
For instance, some authors permit multiple edges and call graphs without them simple 
graphs . ■ 

We denote vertices by letters, w, u, • • • or u 1? y 2 , * * * or simply by numbers 1,2,**' 
(as in Fig. 476). We denote edges by e x , e 2 , * * 4 or by their two endpoints; for instance, 
= (1, 4), e 2 = (1, 2) in Fig. 476. 

An edge (u i5 Uj) is called incident with the vertex v, (and conversely); similarly, 
(Vi, Vj) is incident with Uj. The number of edges incident with a vertex v is called the 
degree of u. Two vertices are called adjacent in G if they are connected by an edge in 
G (that is, if they are the two endpoints of some edge in G). 

We meet graphs in different Fields under different names: as “networks” in electrical 
engineering, “structures” in civil engineering, “molecular structures” in chemistry, 
“organizational structures” in economics, “sociograms,” “road maps,” “telecommunication 
networks,” and so on. 

Digraphs (Directed Graphs) 

Nets of one-way streets, pipeline networks, sequences of jobs in construction work, flows 
of computation in a computer, producer-consumer relations, and many other applications 
suggest the idea of a “digraph” (= directed graph), in which each edge has a direction 
(indicated by an arrow, as in Fig. 478). 



DEFINITION 


Digraph (Directed Graph) 

A digraph G = (V, E) is a graph in which each edge e = (ij) has a direction from 
its “initial point ” i to its “terminal point” j. 


Two edges connecting the same two points L j are now permitted, provided they have 
opposite directions, that is, they are (ij) and (j, /). Example . (1, 4) and (4, I) in Fig. 478. 
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A subgraph or subdigraph of a given graph or digraph G = (V, E), respectively, is a 
graph or digraph obtained by deleting some of the edges and vertices of G, retaining the 
other edges of G (together with their pairs of endpoints). For instance, e l9 e 3 (together 
with the vertices 1, 2, 4) form a subgraph in Fig. 476, and e 3 , <? 4 , e 5 (together with the 
vertices 1, 3, 4) form a subdigraph in Fig. 478. 

Computer Representation of Graphs and Digraphs 

Drawings of graphs are useful to people in explaining or illustrating specific situations. 
Here one should be aware that a graph may be sketched in various ways; see Fig. 479. 
For handling graphs and digraphs in computers, one uses matrices or lists as appropriate 
data structures, as follows. 



(a) (6) (c) 

Fig. 479. Different sketches of the same graph 


Adjacency Matrix of a Graph G: Matrix A = [a^] with entries 

1 if G has an edge (/, y), 

& ij 

.0 else. 

Thus cifj = 1 if and only if two vertices / and y are adjacent in G. Here, by definition, no 
vertex is considered to be adjacent to itself; thus, an = 0. A is symmetric, a y = (Why?) 

The adjacency matrix of a graph is generally much smaller than the so-called incidence 
matrix (see Probs. 21, 22) and is preferred over the latter if one decides to store a graph 
in a computer in matrix form. 

EXAMPLE 1 Adjacency Matrix of a Graph 

Vertex 12 3 4 

Vertex 1 [ 0 I 0 l“ 

2 10 11 

3 0 10 1 

4 L I l 1 oj ■ 

Adjacency Matrix of a Digraph G: Matrix A = [ay] with entries 

1 if G has a directed edge (/, y), 

.0 else. 



This matrix A is not symmetric. (Why?) 
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EXAMPLE 2 


EXAMPLE 3 


Adjacency Matrix of a Digraph 



Lists. The vertex incidence list of a graph shows for each vertex the incident edges. 
The edge incidence list shows for each edge its two endpoints. Similarly for a digraph; 
in the vertex list, outgoing edges then get a minus sign, and in the edge list we now have 
ordered pairs of vertices. 


Vertex Incidence List and Edge Incidence List of a Graph 

This graph is Che same as in Example 1, except for notation. 



Vertex 

Incident Edges 

Edge 

Endpoints 



* 1 

Vv v 2 

v 2 

e h e 2> e 3 

*2 

v 2 , v 3 



*3 

V* ”4 

l> 4 

^ 3 ’ ^ 4 * 65 

*4 

*>4 



*5 

Vu V 4 


“Sparse graphs” are graphs with few edges (far fewer than the maximum possible number 
n(n — l)/2, where n is the number of vertices). For these graphs, matrices are not efficient. 
Lists then have the advantage of requiring much less storage and being easier to handle; 
they can be ordered, sorted, or manipulated in various other ways directly within the 
computer. For instance, in tracing a “walk” (a connected sequence of edges with pairwise 
common endpoints), one can easily go back and forth between the two lists just discussed, 
instead of scanning a large column of a matrix for a single 1. 

Computer science has developed more refined lists, which, in addition to the actual 
content, contain “pointers” indicating the preceding item or the next item to be scanned 
or both items (in the case of a “walk”: the preceding edge or the subsequent one). For 
details, see Refs. [El 6] and [F7]. 

This section was devoted to basic concepts and notations needed throughout this chapter, 
in which we shall discuss some of the most important classes of combinatorial optimization 
problems. This will at the same time help us to become more and more familiar with 
graphs and digraphs. 
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1. Sketch the graph consisting of the vertices and edges 
of a square. Of a tetrahedron. 

2. Worker can do jobs J l and J 2 , worker W 2 job 7 4 , 
worker W 3 jobs J 2 and J 3 . Represent this by a graph. 

3. Explain how the following may be regarded as graphs 
or digraphs: flight connections between given cities; 
memberships of some persons in some committees; 
relations between chapters of a book: a tennis 
tournament; a family tree. 

4. How would you represent a net of one-way and two-way 
streets by a digraph? 

5. Give further examples of situations that could be 
represented by a graph or digraph. 

6. Find the adjacency matrix of the graph in Fig. 476. 

7. When will the adjacency matrix of a graph be 
symmetric? Of a digraph? 


8-13 


ADJACENCY MATRIX 


Find the adjacency matrix of the graph or digraph. 



13 . 



Sketch the graph whose adjacency matrix is: 

r o i i n 



i 

i 


0 

1 

1 


15. 


1 

0 


Lo 


16 . 


0 

1 

1 


1 

0 

0 

0 

1 

0 

0 


1 

0 

1 

0 

0 

0 

1 

1 

0 

0 


1 

1 

0. 

0” 

0 

1 

0„ 

r 

1 

I 


L 1 1 1 Oj 


Sketch the digraph whose adjacency matrix is: 

17. The matrix in Prob. 14. 

18. The matrix in Prob. 16. 

19. (Complete graph) Show that a graph G with n vertices 
can have at most n(n — l)/2 edges, and G has exactly 
n(n — I )/2 edges if G is complete, that is, if every pair 
of vertices of G is joined by an edge. (Recall that loops 
and multiple edges are excluded.) 

20. In what case are all the off-diagonal entries of the 
adjacency matrix of a graph G equal to 1? 


Incidence Matrix of a Graph: Matrix B = [bj k ] with 
entries 


b ik = 


C 


if vertex j is an endpoint of edge e k 
otherwise. 


Find the incidence matrix of: 


21. The graph in Prob. 9. 

22. The graph in Prob. 8. 

Incidence Matrix of a Digraph: Matrix B = (b jk ] with 
entries 



— 1 if edge e k leaves vertex j 
1 if edge e k enters vertex j 

, 0 otherwise. 


Find the incidence matrix of: 

23. The digraph in Prob. 11. 

24. The digraph in Prob. 13. 

25. Make a vertex incidence list of the digraph in Prob. 13. 
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23.2 Shortest Path Problems. Complexity 

Beginning in this section, we shall discuss some of the most important classes of 
optimization problems that concern graphs and digraphs as they arise in applications. Basic 
ideas and algorithms will be explained and illustrated by small graphs, but you should 
keep in mind that real-life problems may often involve many thousands or even millions 
of vertices and edges (think of telephone networks, worldwide air travel, companies that 
have offices and stores in all larger cities). Then reliable and efficient systematic methods 
are an absolute necessity — solution by inspection or by trial and error would no longer 
work, even if ‘‘nearly optimal” solutions are acceptable. 

We begin with shortest path problems, as they arise, for instance, in designing shortest 
(or least expensive, or fastest) routes for a traveling salesman, for a cargo ship, etc. Let 
us first explain what we mean by a path. 

In a graph G = (V, E) we can walk from a vertex v 1 along some edges to some other 
vertex v k . Here we can 

(A) make no restrictions, or 

(B) require that each edge of G be traversed at most once, or 

(C) require that each vertex be visited at most once. 

In case (A) we call this a walk. Thus a walk from to u k is of the form 
(1) (^ 1 , v 2 )> ( v 2 , y 3 )> # . (Ofc-l, v k ), 

where some of these edges or vertices may be the same. In case (B), where each edge 
may occur at most once, we call the walk a trail. Finally, in case (C), where each vertex 
may occur at most once (and thus each edge automatically occurs at most once), we call 
the trail a path. 

We admit that a walk, trail, or path may end at the vertex it started from, in which case 
we call it closed; then v k = v 1 in (1). 

A closed path is called a cycle. A cycle has at least three edges (because we do not 
have double edges; see Sec. 23.1). Figure 480 illustrates all these concepts. 



Fig. 480. Walk, trail, path, cycle 

1 - 2 — 3 — 2 is a walk (not a trail). 

4-1-2-3-4-5isa trail (not a path). 

1— 2 — 3 — 4 — 5 is a path (not a cycle). 

1 — 2 — 3 — 4 — 1 is a cycle. 

Shortest Path 

To define the concept of a shortest path, we assume that G = (V, E) is a weighted graph, 
that is, each edge (u ; , Vj) in G has a given weight or length l y > 0. Then a shortest path 
v i v k (with fixed Oj and v k ) is a path (1) such that the sum of the lengths of its edges 

*12 ^23 ^34 + ■ ■ ■ + /fc-l,fc 

(hz = length of (v 1 , v 2 ). etc.) is minimum (as small as possible among all paths from 
t'l to Vk). Similarly, a longest path u, — > v k is one for which that sum is maximum. 
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Shortest (and longest) path problems are among the most important optimization problems. 
Here, “length” 1# (often also called “cost” or “weight”) can be an actual length measured 
in miles or travel time or gasoline expenses, but it may also be something entirely different. 

For instance, the “ traveling salesman problem ” requires the determination of a shortest 
Hamiltonian 1 cycle in a graph, that is, a cycle that contains all the vertices of the graph. 

As another example, by choosing the “most profitable” route v 1 a salesman may 
want to maximize 2/^*, where 1^ is his expected commission minus his travel expenses 
for going from town i to town j. 

In an investment problem, i may be the day an investment is made,y the day it matures, 
and l-ij the resulting profit, and one gets a graph by considering the various possibilities 
of investing and reinvesting over a given period of time. 

Shortest Path if All Edges Have Length / = 1 

Obviously, if all edges have length 1, then a shortest path — > v k is one that has the 

smallest number of edges among all paths — » v k in a given graph C. For this problem 

we discuss a BFS algorithm. BFS stands for Breadth First Search. This means that in 
each step the algorithm visits all neighboring (all adjacent) vertices of a vertex reached, 
as opposed to a DFS algorithm (Depth First Search algorithm), which makes a long trail 
(as in a maze). This widely used BFS algorithm is shown in Table 23.1. 

We want to find a shortest path in G from a vertex s {start) to a vertex t (terminal). To 
guarantee that there is a path from s to /, we make sure diat G does not consist of separate 
portions. Thus we assume that G is connected, that is, for any two vertices v and w there 
is a path v — > w in G. (Recall that a vertex v is called adjacent to a vertex u if there is 
an edge («, v) in G.) 

Table 23.1 Moore’s BFS for Shortest Path (All Lengths One) 

Proceedings of the International Symposium for Switching Theory. Pari II. pp. 285-292. Cambridge: Harvard 
Universiiy Press, 1959. 

ALGORITHM MOORE [G = (V, £), 5 , t] 

This algorithm determines a shortest path in a connected graph G = ( V, E) from a vertex 
s to a vertex t. 

INPUT: Connected graph G = (V, £), in which one vertex is denoted by s and 

one by t , and each edge (i,y) has length = 1. Initially all vertices are 
unlabeled. 

OUTPUT: A shortest path s — » / in G = (V, E) 

1. Label s with 0. 

2. Set / = 0. 

3. Find all unlabeled vertices adjacent to a vertex labeled /. 

4. Label the vertices just found with / -M. 

5. If vertex t is labeled, then “backtracking” gives the shortest path 

k (= label of r), k - 1, k - 2, • • • , 0 

OUTPUT *, k - 1, k - 2, ■ • • , 0. Stop 
Else increase i by 1 . Go to Step 3. 

End MOORE 


1 WILLIAM ROWAN HAMILTON ( 1805-1865), Irish mathematician, known for his work in dynamics. 
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EXAMPLE 1 Application of Moore’s BFS Algorithm 

Find a shortest path $ — ► / in the graph G shown in Fig. 481. 

Solution . Figure 481 shows the labels. The blue edges form a shortest path (length 4). There is another 
shortest path s — * r. (Can you find it?) Hence in the program we must introduce a rule that makes backtracking 
unique because otherwise the computer would not know what to do next if at some step there is a choice (for 
instance, in Fig. 481 when it got back to the vertex labeled 2). The following rule seems to be natural. 

Backtracking rule . Using the numbering of the vertices from I to n (not the labeling!), at each step, if a 
vertex labeled / is reached, take as the next vertex that with the smallest number (not label!) among all the 
vertices labeled / — 1. ■ 


2 



Fig. 481. Example 1, given graph and result of labeling 


Complexity of an Algorithm 

Complexity of Moore's algorithm. To find the vertices to be labeled 1, we have to scan 
all edges incident with s. Next, when / = 1 , we have to scan all edges incident with vertices 
labeled 1 , etc. Hence each edge is scanned twice. These are 2m operations (in = number 
of edges of G). This is a function c(m). Whether it is 2m or 5m + 3 or 12m is not so essential; 
it is essential that c(m) is proportional to m (not m 2 , for example); it is of the “order” m. 
We write for any function am + b simply 0(m), for any function am 2 + bm 4- d simply 
0(m 2 ), and so on; here, O suggests order. The underlying idea and practical aspect are 
as follows. 

In judging an algorithm, we are mostly interested in its behavior for very large problems 
(large m in the present case), since these are going to determine the limits of the 
applicability of the algorithm. Thus, the essential item is the fastest growing term (am 2 
in am 2 + bm + d y etc.) since it will overwhelm the others when m is large enough. Also, 
a constant factor in this term is not very essential; for instance, the difference between 
two algorithms of orders, say, 5 w 2 and 8m 2 is generally not very essential and can be 
made irrelevant by a modest increase in the speed of computers. However, it does make 
a great practical difference whether an algorithm is of order m or m 2 or of a still higher 
power m p . And the biggest difference occurs between these “polynomial orders” and 
“exponential orders,” such as 2 m . 

For instance, on a computer that does 10 9 operations per second, a problem of size 
m = 50 will take 0.3 second with an algorithm that requires m 5 operations, but 13 days 
with an algorithm that requires 2™ operations. But this is not our only reason for regarding 
polynomial orders as good and exponential orders as bad. Another reason is the gain in 
using a faster computer . For example let two algorithms be 0(m) and 0(m 2 ). Then, since 
1000 = 31.6 2 , an increase in speed by a factor 1000 has the effect that per hour we can 
do problems 1000 and 31.6 times as big, respectively. But since 1000 = 2 9 * 97 , with an 
algorithm that is 0(2 m ), all we gain is a relatively modest increase of 10 in problem size 
because 2 997 • 2 m = 2 m+M7 . 
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The symbol O is quite practical and commonly used whenever the order of growth is 
essential, but not the specific form of a function. Thus if a function g{m) is of the form 

g(m) = kh{m) + more slowly growing terms ( k £ 0, constant), 

we say that g{m) is of the order h{m) and write 

g(m) = 

For instance, 

am + b= 0{m\ am 2 + bm + d = 0{m\ 5 • 2 W + 3 m 2 = 0{2 m ). 

We want an algorithm si to be “efficient,” that is, “good” with respect to 

(i) Time (number c^(m) of computer operations), or 
(il) Space (storage needed in the internal memory) 

or both. Here c ^ suggests “complexity” of si. Two popular choices for c are 

{Worst case) c wl (m) = longest time si takes for a problem of size m, 

{Average case) c % J{m) = average time si takes for a problem of size m. 

In problems on graphs, the “size” will often be m (number of edges) or n (number of 
vertices). For our present simple algorithm, c\, L (m) = 2m in both cases. 

For a “good” algorithm si, we want that c M {m) does not grow too fast. Accordingly, 
we call si- efficient if c^{m) = 0{m k ) for some integer k ^ 0; that is, may contain 
only powers of m (or functions that grow even more slowly, such as In m), but no 
exponential functions. Furthermore, we call si polynomially bounded if si is efficient 
when we choose the “worst case” c si {m). These conventional concepts have intuitive 
appeal, as our discussion shows. 

Complexity should be investigated for every algorithm, so that one can also compare 
different algorithms for the same task. This may often exceed the level in this chapter; 
accordingly, we shall confine ourselves to a few occasional comments in this direction. 




[mT| shortest path 

Find a shortest path P: s — > t and its length by Moore’s 
BFS algorithm; sketch the graph with the labels and indicate 
P by heavier lines (as in Fig. 481). 



5. s t 6. 

7. (Nonuniqueness) A shortest path .v— » t for given $ and 
t need not be unique. Illustrate this by Finding another 
shortest path s— > t in Example 1 in the text. 

8. (Maximum length) If P is a shortest path between any 
two vertices in a graph with n vertices, how many edges 
can P at most have? In a complete graph (with all edges 
of length I )? Give a reason. 

9* (Moore’s algorithm) Show that if a vertex v has label 
A(u) = k , then there is a path s—>voi length k. 
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16. Find 4 different closed Euler trails in Fig. 484. 


Fig. 484. Problem 16 

17. The postman problem is the problem of finding a 
closed walk W: s — » s (s the post office) in a graph G 
with edges (/. j) of length 1^ > 0 such that every edge 
of G is traversed at least once and the length of W is 
minimum. Find a solution for the graph in Fig. 483 by 
inspection. (The problem is also called the Chinese 
postman problem since it was published in the journal 
Chinese Mathematics 1 (1962), 273-277.) 

18. Show that the length of a shortest postman trail is the 
same for every starting vertex. 

19. (Order) Show that 0(m 3 ) + 0(m 3 ) = 0(m 3 ) and 
kO(m p ) = 0(m p l 

20. Show that V I + m 2 = 0(m), 0.02e m + 100m 2 = 0(e m ). 

21. If we switch from one computer to another that is 100 
times as fast, what is our gain in problem size per hour 
in the use of an algorithm that is 0(m). 0(m 2 ), 0(m 5 ). 
0(e m )? 

22. CAS PROBLEM. Moore’s Algorithm. Write a 
computer program for the algorithm in Table 23. 1 . Test 
the program with the graph in Example 1. Apply it to 
Probs. 1-3 and to some graphs of your own choice. 


23.3 Bellman's Principle. Dijkstra's Algorithm 

We continue our discussion of the shortest path problem in a graph G. The last section 
concerned the special case that all edges had length 1. But in most applications the edges 
(/, j) will have any lengths /^- > 0, and we now turn to this general case, which is of 
greater practical importance. We write = sc for any edge (/, j) that does not exist in G 
(setting oo 4- a = for any number a y as usual). 

We consider the problem of finding shortest paths from a given vertex, denoted by 1 
and called the origin, to all other vertices 2, 3, • • • , n of G. We let Lj denote the length 
of a shortest path Pf 1 — ^ j in G. 

THEOREM 1 Bellman’s Minimality Principle or Optimality Principle 2 

If Pj.‘ 1 — » j is a shortest path from 1 to j in G and (/, j) is the last edge of Pj 

(Fig. 485), then /V 1 / [obtained by dropping (/, j) from Pf\ is a shortest path 


2 RICHARD BELLMAN ( 1 920-1 984), American mathematician, known for his work in dynamic programming. 



10. Call the length of a shortest path s — » v the distance 
of v from s. Show that if v has distance /, it has label 
A(u) = /. 

11. (Hamiltonian cycle) Find and sketch a Hamiltonian 
cycle in the graph of Prob. 3. 

12. Find and sketch a Hamiltonian cycle in the graph of a 
dodecahedron, which has 12 pentagonal faces and 
20 vertices (Fig. 482). This is a problem Hamilton 
himself considered. 



Fig. 482. Problem 12 


13. Find and sketch a Hamiltonian cycle in Fig. 479, 
Sec. 23.1. 

14. (Euler graph) An Euler graph G is a graph that has a 
closed Euler trail. An Euler trail is a trail that contains 
every edge of G exactly once. Which subgraph with 
four edges of the graph in Example 1, Sec. 23.1, is an 
Euler graph? 

15. Is the graph in Fig. 483 an Euler graph? (Give a reason.) 



Fig. 483. Problems 15, 17 
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P 

i 



Fig. 485. Paths P and P f in Bellman’s minimality principle 


PROOF Suppose that the conclusion is false. Then there is a path P '**: 1 — > / that is shorter than 
Pi. Hence if we now add (/, j) to P * 9 we get a path 1 — » j that is shorter than Pj. This 
contradicts our assumption that Pj is shortest. ■ 

From Bellman’s principle we can derive basic equations as follows. For fixed j we may 
obtain various paths 1 — » j by taking shortest paths Pi for various / for which there is in 
G an edge (/,./), and add (/,./) to the corresponding P j. These paths obviously have lengths 
L t 4- l t j (Lj = length of P t ). We can now take the minimum over /, tliat is, pick an / for 
which Li + ly is smallest. By the Bellman principle, this gives a shortest path 1 — > j. It 
has the length 


( 1 ) 


L x = 0 

1*3 ^11 n h.j)y 


These are die Bellman equations. Since I H = 0 by definition, instead of min^ we can 
simply write min/. These equations suggest the idea of one of the best-known algorithms 
for the shortest path problem, as follows. 

Dijkstra's Algorithm for Shortest Paths 

Dijkstra’s 3 algorithm is shown in Table 23.2, where a connected graph G is a graph in 
which for any two vertices u and w in G there is a path v — » w. The algorithm is a labeling 
procedure. At each stage of the computation, each vertex v gets a label, either 

(PL) a permanent label = length L v of a shortest path 1 — > v 


or 

(TL) a temporary label = upper bound L v for the length of a shortest path 1 — » u . 

We denote by 2Pi£ and the sets of vertices with a permanent label and with a temporary 
label, respectively. The algorithm has an initial step in which vertex 1 gets the permanent 
label Lj = 0 and the other vertices get temporary labels, and then die algorithm alternates 
between Steps 2 and 3. In Step 2 the idea is to pick k “minimally.” In Step 3 the idea is 
that the upper bounds will in general improve (decrease) and must be updated accordingly. 
Namely, the new temporary label Lj of vertex j will be the old one if there is no 
improvement or it will be L k + Ikj if there is. 


3 ED$GER WYBE DIJKSTRA (1930-2002), Dutch computer scientist, 1972 recipient of the ACM Turing 
Award. His algorithm appeared in Numerische Mathematik 1 (1959), 269-271. 
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EXAMPLE 1 


Table 23.2 Dijkstra’s Algorithm for Shortest Paths 

ALGORITHM DIJKSTRA [G = (V, E), V = {1, • • • , n} 9 for all (/, j) in E] 

Given a connected graph G = ( V , E) with vertices 1, • • • , n and edges (/, j) having 
lengths lij > 0, this algorithm determines the lengths of shortest paths from vertex 1 to 
the vertices 2, • • • , n. 

INPUT: Number of vertices n, edges (1,7), and lengths ly 
OUTPUT: Lengths Lj of shortest paths 1 ->j 9 j — 2, • • • , n 

1. Initial step 

Vertex 1 gets PL: L 1 = 0. 

Vertex /* (= 2, • • • , n) gets TL: Lj = l X j (= if there is no edge (1 ,j) in G). 
Set3>i= { 1 },^= (2,3 , •••, 72 }. 

2. Fixing a permanent label 

Find a k in for which L k is miminum, set L k = L k . Take the smallest k if 
there are several. Delete k from and include it in 9\££. 

If ST££ = 0 (that is, S T2E is empty) then 

OUTPUT L 2 , ■ • • , L n . Stop 

Else continue (that is, go to Step 3). 

3. Updating temporary labels 

For all j in 2Ti£, set Lj = min* (Lj, L k + l kj } (that is, take the smaller of Lj and 
L k + l kj as your new Lj). 

Go to Step 2. 

End DIJKSTRA 


Application of Dijkstra’s Algorithm 

Applying Dijkstra’s algorithm to the graph in Fig. 486a, find shortest paths from vertex 1 to vertices 2, 3, 4. 
Solution, We list the steps and computations. 


1. 

Li = 0, £ 2 = 8 , Z 3 = 5, £4 = 7, 

= 1 1 1 , 

STS = (2, 3, 4) 

2. 

Dj = min {£ 2 . Z 3 , £ 4 } = 5, k = 3, 

VX= [1.3}. 

9 r ‘£= { 2 . 4| 

3. 

£2 = min { 8 , I 3 + / 32 ] = min { 8 , 5 + 1 } = 6 
£4 = min {7, L 3 *1- / 34 } = min {7, <»} =7 



2. 

£.2 — min {£ 2 . £ 4 } = min { 6 , 7) = 6 , k = 2, 

(1,2,3), 

II 

8 

3. 

£4 = min {7, + ^ 4 } “ min {7, 6 H- 2} =7 



2 . 

I 4 - 7, * = 4 

= { 1, 2, 3, 4}. 

g-<e = 0 . 

Figure 486b shows the resulting shortest paths, of lengths L 2 

= 6 . Lj = 5. L 4 = 7. 

■ 






^^4) (3^ 




(a) Given graph G 

(b) Shortest paths in G 



Fig. 486. Example 1 


Complexity. Dijkstra 's algorithm is 0(n 2 ). 
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PROOF Step 2 requires comparison of elements, first n — 2, the next time n — 3, etc., a total 
of (n - 2 )(n — l)/2. Step 3 requires the same number of comparisons, a total of 
{n — 2)0 1 — l)/2, as well as additions, first n — 2, the next time n — 3, etc., again a total of 
0? — 2)(n — l)/2. Hence the total number of operations is 3(n — 2)0 1 — l)/2 = 0(n 2 ). ■ 


! PROBLEM SE T 23.3" 


1. The net of roads in Fig. 487 connecting four villages 
is to be reduced to minimum length, but so that one 
can still reach every village from every other village. 
Which of the roads should be retained? Find the 
solution (a) by inspection, (b) by Dijksira’s 
algorithm. 



Fig. 487. Problem 1 


2-7 


DIJKSTRA'S ALGORITHM 


Find shortest paths for the following graphs. 







8. Show that in Dijkstra’s algorithm, for L k there is a path 
P: 1 — * k of length L k . 

9. Show that in Dijkstra’s algorithm, at each instant the 
demand on storage is light (data for less than n edges). 

10. CAS PROBLEM. Dijkstra’s Algorithm. Write a 
program and apply it to Probs. 2-4. 


23.4 Shortest Spanning Trees: 

Greedy Algorithm 

So far we have discussed shortest path problems. We now turn to a particularly important 
kind of graph, called a tree, along with related optimization problems that arise quite often 
in practice. 

By definition, a tree T is a graph that is connected and has no cycles. “Connected” 
was defined in Sec. 23.3; it means that there is a path from any vertex in T to any other 
vertex in T. A cycle is a path s — » / of at least three edges that is closed (/ = s); see also 
Sec. 23.2. Figure 488a shows an example. 

The terminology varies; cycles are sometimes also called circuits. 
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A spanning tree T in a given connected graph G = (V, £) is a tree containing all the 
n vertices of G . See Fig. 488b. Such a tree has n — 1 edges. (Proof?) 

A shortest spanning tree Tina connected graph G (whose edges (/, j) have lengths 
lij > 0) is a spanning tree for which S/.^ (sum over all edges of T) is minimum compared 
to 2 for any other spanning tree in G. 

Trees are among the most important types of graphs, and they occur in various 
applications. Familiar examples are family trees and organization charts. Trees can be used 
to exhibit organize, or analyze electrical networks, producer-consumer and other business 
relations, information in database systems, syntactic structure of computer programs, etc. 
We mention a few specific applications that need no lengthy additional explanations. 

The set of shortest paths from vertex 1 to the vertices 2, ••*,/? in the last section forms 
a spanning tree. 

Railway lines connecting a number of cities (the vertices) can be set up in the form of 
a spanning tree, the “length” of a line (edge) being the construction cost, and one wants 
to minimize the total construction cost. Similarly for bus lines, where “length” may be 
the average annual operating cost. Or for steamship lines (freight lines), where “length” 
may be profit and the goal is the maximization of total profit. Or in a network of telephone 
lines between some cities, a shortest spanning tree may simply represent a selection of 
lines that connect all the cities at minimal cost. In addition to these examples we could 
mention others from distribution networks, and so on. 

We shall now discuss a simple algorithm for the problem of finding a shortest spanning 
tree. This algorithm (Table 23.3) is particularly suitable for sparse graphs (graphs with 
very few edges; see Sec. 23.1). 

Table 23.3 Kruskal’s Greedy Algorithm for Shortest Spanning Trees 

Proceedings of die American Mathematical Society 7 (1956), 48-50. 

ALGORITHM KRUSKAL [G = (V, £), ly for all (ij) in £] 

Given a connected graph G = ( V, £) with edges (ij) having length > 0, the algorithm 

determines a shortest spanning tree T in G. 

INPUT: Edges (/, j) of G and their lengths /y 

OUTPUT: Shortest spanning tree T in G 

1. Order the edges of G in ascending order of length. 

2. Choose them in this order as edges of T, rejecting an edge only if it forms a 
cycle with edges already chosen. 

If n - 1 edges have been chosen, then 

OUTPUT T (= the set of edges chosen). Stop 

End KRUSKAL 



(a) A cycle (b) A spanning tree 

Fig. 488. Example of (a) a cycle, (b) a spanning tree in a graph 
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EXAMPLE 1 


Application of Kruskal’s Algorithm 

Using KruskaPs algorithm, we shall determine a shortest spanning tree in the graph in Fig. 489. 



Fig. 489. Graph in Example 1 


Solution . See Table 23.4. In some of the intermediate stages the edges chosen form a disconnected graph 
(see Fig. 490); this is typical. We stop after /i — 1=5 choices since a spanning tree has // - 1 edges. In our 
problem the edges chosen are in the upper part of the list. This is typical of problems of any size; in general, 
edges farther down in the list have a smaller chance of being chosen. I 


Table 23.4 Solution in Example 1 


Edge 

Length 

Choice 

(3, 6) 

1 

1st 

(1,2) 

2 

2nd 

(1, 3) 

4 

3rd 

(4, 5) 

6 

4th 

(2, 3) 

7 

Reject 

(3, 4) 

8 

5th 

(5, 6) 

9 


(2,4) 

11 



The efficiency of KruskaTs method is greatly increased by 

Double Labeling of Vertices. Each vertex i carries a double label (/*, pj), where 

r t = Root of the subtree to which i belongs , 

Pi — Predecessor of i In its subtree , 

Pi = 0 for roots. 

This simplifies 

Rejecting. If (i\j) is next in the list to be considered , reject ( i,j ) if r f = rj (that is, i and 
j are in the same subtree, so that they are already joined by edges and (/, j) would thus 
create a cycle). lfr t ^= /j-, include (i, j) in T. 

If there are several choices for r h choose the smallest. If subtrees merge (become a 
single tree), retain the smallest root as the root of the new subtree. 

For Example l the double-label list is shown in Table 23.5. In storing it, at each instant 
one may retain only the latest double label. We show all double labels in order to exhibit 
the process in all its stages. Labels that remain unchanged are not listed again. Underscored 
are the two 1 ’s that are the common root of vertices 2 and 3, the reason for rejecting the 
edge (2, 3). By reading for each vertex the latest label we can read from this list that 1 is 
the vertex we have chosen as a root and the tree is as shown in the last part of Fig. 490. 
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6 

First 


1 2 

@ a 


\ \ 


\ / 4 3 *v^' 


Second Third Fourth 

Fig. 490. Choice process in Example 1 


Fifth 


This is made possible by the predecessor label that each vertex carries. Also, for accepting 
or rejecting an edge we have to make only one comparison (the roots of the two endpoints 
of the edge). 

Ordering is the more expensive part of the algorithm. It is a standard process in data 
processing for which various methods have been suggested (see Sorting in Ref. [E25] 
listed in App. 1 ). For a complete list of m edges, an algorithm would be 0(m log 2 m\ 
but since the n — 1 edges of the tree are most likely to be found earlier, by inspecting 
the q (< m) topmost edges, for such a list of q edges one would have 
0(q log 2 m). 


Table 23.5 List of Double Labels in Example 1 


Vertex 

Choice 1 
(3, 6) 

Choice 2 
(1,2) 

Choice 3 
(1,3) 

Choice 4 
(4, 5) 

Choice 5 
(3,4) 

1 


(1,0) 




2 


(1, 1) 




3 

(3, 0) 


a, 1) 



4 




(4,0) 

(1,3) 

5 




(4,4) 

(1,4) 

6 

(3, 3) 


(1,3) 
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7. CAS PROBLEM. Kruskal’s Algorithm. Write a 
corresponding program. (Sorting is discussed in Ref. 
[E25] listed in App. 1.) 


8. Design an algorithm for obtaining longest spanning 
trees. 

9. Apply the algorithm in Prob. 8 to the graph in Example 
1. Compare with the result in Example 1. 

10. To get a minimum spanning tree, instead of adding 
shortest edges, one could think of deleting longest 
edges. For what graphs would this be feasible? 
Describe an algorithm for this. 

11. Apply the method suggested in Prob. 10 to the graph 
in Example 1. Do you get the same tree? 

12. Find a shortest spanning tree in the complete graph of 
all possible 15 connections between the six cities given 
(distances by airplane, in miles, rounded). Can you 
think of a practical application of the result? 



Dallas 

Denver 

Los Angeles 

New York 

Washington, DC 

Chicago 

800 

900 

1800 

700 

650 

Dallas 


650 

1300 

1350 

1200 

Denver 



850 

1650 

1500 

Los Angeles 




2500 

2350 

New York 





200 


13. (Forest) A (not necessarily connected) graph without 
cycles is called a forest. Give typical examples of 
applications in which graphs occur that are forests or 
trees. 


14-20 

Prove: 


GENERAL PROPERTIES OF TREES 


14. (Uniqueness) The path connecting any two vertices u 
and v in a tree is unique. 

15. If in a graph any two vertices are connected by a unique 
path, the graph is a tree. 


16. If a graph has no cycles, it must have at least 2 vertices 
of degree I (definition in Sec. 23.1). 

17. A tree with exactly two vertices of degree 1 must be a 
path. 

18. A tree with n vertices has n — I edges. (Proof by 
induction.) 

19. If two vertices in a tree are joined by a new edge, a 
cycle is formed. 

20. A graph with // vertices is a tree if and only if it has 
n - 1 edges and has no cycles. 


23.5 Shortest Spanning Trees: Prim's Algorithm 

Prim’s algorithm shown in Table 23.6 is another popular algorithm for the shortest 
spanning tree problem (see Sec. 23.4). This algorithm avoids ordering edges and gives a 
tree T at each stage, a property that Kruskal’s algorithm in the last section did not have 
(look back at Fig. 490 if you did not notice it). 

In Prim’s algorithm, starting from any single vertex, which we call 1, we “grow” the 
tree T by adding edges to it, one at a time, according to some rule (in Table 23.6) until 
T finally becomes a spanning tree, which is shortest. 

We denote by U the set of vertices of the growing tree T and by S the set of its edges. 
Thus, initially U = { 1 f and 5 = 0; at the end, U — V, the vertex set of the given graph 
G = (V, E), whose edges (ij) have length l tj > 0, as before. 
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Thus at the beginning (Step 1) the labels 

A 2 , • • • , A w of the vertices 2, • • • , n 

are the lengths of the edges connecting them to vertex l (or if there is no such edge in 
G). And we pick (Step 2) the shortest of these as the first edge of the growing tree T and 
include its other end j in U (choosing the smallest j if there are several, to make the process 
unique). Updating labels in Step 3 (at this stage and at any later stage) concerns each 
vertex k not yet in U. Vertex k has label A fc = / i(fc)ffc from before. If l jk < \ kf this means 
that k is closer to the new member j just included in U than k is to its old “closest neighbor” 
i(k) in U. Then we update the label of k , replacing \ k — by A k = l jk and setting 
i(k) = j. If, however, l jk ^ A k (the old label of k ), we don’t touch the old label. Thus the 
label A k always identifies the closest neighbor of k in U , and this is updated in Step 3 as 
U and the tree T grow. From the final labels we can backtrack the final tree, and from 
their numeric values we compute the total length (sum of the lengths of the edges) of this 
tree. 


Table 23.6 Prim's Algorithm for Shortest Spanning Trees 

Bell System Technical Journal 36 (1957). 1389-1401. 

For an improved version of the algorithm, see Cheriton and Taijan, SIAM Journal on Computation 5 
(1976). 724-742. 


ALGORITHM PRIM [G = (V, E), V = {1, • ■ • , n] y for all (ij) in E] 

Given a connected graph G = (V, E) with vertices 1,2 ,•••,/? and edges (/,y) having 
length lij > 0, this algorithm determines a shortest spanning tree T in G and its length 
L(T). 

INPUT: /?. edges (i,j) of G and their lengths 

OUTPUT: Edge set S of a shortest spanning tree T in G; L(T) 

[Initially, all vertices are unlabeled.] 

1. Initial step 

Set i(k) = I, U= {1),S = 0. 

Label vertex k (= 2, • • • . n) with \ k = l ik f= ^ if G has no edge (L k)]. 

2. Addition of an edge to the tree T 

Let A j be the smallest A k for vertex k not in U. Include vertex j in U and edge 
(i(j)J) in S. 

If U = V then compute 

L(T) = 2/y (sum over all edges in S) 

OUTPUT S, L(T). Stop 

[S is the edge set of a shortest spanning tree T in G.] 

Else continue (that is, go to Step 3). 

3. Isabel updating 

For every k not in G, if l jk < A fc> then set A fe = l jk and i{k) = j. 

Go to Step 2. 

End PRIM 
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Fig. 491. Graph in Example 1 


EXAMPLE 1 Application of Prim’s Algorithm 

Find a shortest spanning tree in the graph in Fig. 491 (which is the same as in Example 1, Sec. 23.4, so that we 
can compare). 

Solution . The steps are as follows. 

1. i(k) — i, U ~ ( 1 }, S = 0, initial labels see Table 23.7. 

2. A 2 = / 12 = 2 is smallest, U * { 1, 2}, S - {(1, 2)} 

3. Update labels as shown in Table 23.7, column (I). 

2. A 3 = / 13 = 4 is smallest, U= {1,2, 3), 5= {(I, 2), (I, 3)) 

3. Update labels as shown in Table 23.7, column (11). 

2. A 6 = / 36 = 1 is smallest, U = (1, 2, 3, 6), 5 = {(1, 2), (1, 3). (3, 6)) 

3. Update labels as shown in Table 23.7, column (111). 

2. A 4 - = 8 is smallest, U = ( 1, 2, 3, 4, 6}, S = {(1, 2), (1, 3). (3, 4), (3, 6)1 

3. Update labels as shown in Table 23.7, column (IV). 

2- A 5 = / 45 = 6 is smallest, U = V, S = (1, 2), (1, 3), (3, 4), (3, 6), (4, 5). Stop. 

The tree is the same as in Example 1, Sec. 23.4. Its length is 21. You will find it interesting to compare the 

growth process of the present tree with that in Sec. 23.4. ■ 


Table 23.7 Labeling of Vertices in Example 1 


Vertex 

Initial 

Label 


Relabeling 


(I) 

to 

(HI) 

(IV) 

2 

i\2 = 2 

— 

— 

— 

— 

3 

^13 = 4 

w 

II 

— 

— 

— 

4 

oc 

'24= 11 

00 

II 

sT 

ii 

00 

— 

5 

CO 

CO 

CO 

^65 “ 9 

11 

ON 

6 

oc 

oc 

sT 

ii 

— 

— 


PROBLEM SET 23.5 
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8. (Complexity) Show that Prim’s algorithm has 
complexity 0(/? 2 ). 

9* How does Prim’s algorithm prevent the formation of 
cycles as one grows 7? 

10. For a complete graph (or one that is almost complete), 
if our data is an n X n distance table (as in Prob. 12, 
Sec. 23.4). show that the present algorithm [which is 
0(n 2 )] cannot easily be replaced by an algorithm of 
order less than 0{jn 2 ). 

11. In what case will Prim’s algorithm give S — E as the 
final result? 

12. TEAM PROJECT. Center of a Graph and Related 
Concepts, (a) Distance, eccentricity. Call the length 
of a shortest path u — » v in a graph G = (V, E) the 
distance d(u , v) from u to v. For fixed w, call the 
greatest d(u, u) as v ranges over V the eccentricity e(u) 
of u. Find the eccentricity of vertices 1, 2, 3 in the 
graph in Prob. 7. 


(b) Diameter, radius, center. The diameter d(G) of 
a graph G = (V, E) is the maximum of d(u, v) as u and 
v vary over V, and the radius r(G) is the smallest 
eccentricity e(v) of the vertices o. A vertex u with 
e(v) = r(G) is called a centra / vertex. The set of all 
central vertices is called the center of G. Find d(G ), 
r(G) and the center of the graph in Prob. 7. 

(c) What are the diameter, radius, and center of the 
spanning tree in Example I? 

(d) Explain how the idea of a center can be used in 
setting up an emergency service facility on a 
transportation network. In setting up a fire station, a 
shopping center. How would you generalize the 
concepts in the case of two or more such facilities? 

(e) Show that a tree T whose edges all have length 1 
has center consisting of either one vertex or two 
adjacent vertices. 

(f) Set up an algorithm of complexity 0(n) for finding 
the center of a tree T. 

13. What would the result be if you applied Prim’s 
algorithm to a graph that is not connected? 

14. CAS PROBLEM. Prim’s Algorithm. Write a 
program and apply it to Probs. 4-6. 


23.6 Flows in Networks 

After shortest path problems and problems for trees, as a third large area in combinatorial 
optimization we discuss flow problems in networks (electrical, water, communication, 
traffic, business connections, etc.), turning from graphs to digraphs (directed graphs; see 
Sec. 23.1). 

By definition, a network is a digraph G = (V, E) in which each edge (ij) has assigned 
to it a capacity c % > 0 [= maximum possible flow along (/, J)], and at one vertex, s, 
called the source, a flow is produced that flows along the edges of the digraph G to another 
vertex, f, called the target or sink, where the flow disappears. 

In applications, this may be the flow of electricity in wires, of water in pipes, of cars 
on roads, of people in a public transportation system, of goods from a producer to 
consumers, of e-mail from senders to recipients over the Internet, and so on. 

We denote the flow along a (directed!) edge (/, j) by and impose two conditions: 

1. For each edge (/, J) in G the flow does not exceed the capacity c. y f 

(1) 0 ^ fij ^ (“Edge condition”). 

2. For each vertex /, not s or t> 

Inflow = Outflow (“Vertex condition ” “KirchhofTs law”); 
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in a formula, 

( 2 ) 


2 hi - 2 fa 

k j 


Inflow 


Out llow 


0 if vertex / # s, / ^ /, 
< — / at the source s , 

/ at the target (sink) f, 


where / is the total flow (and at s the inflow is zero, whereas at r the outflow is zero). 
Figure 492 illustrates the notation (for some hypothetical figures). 


Fig. 492. 



Paths 

By a path v 1 — » v k from a vertex u* to a vertex v k in a digraph G we mean a sequence 
of edges 

(Ui, v 2 l (u 2 , v 3 ), • • • , (v k _ v v k f 


regardless of their directions in G, that forms a path as in a graph (see Sec. 23.2). Hence 
when we travel along this path from to u k we may traverse some edge in its given 
direction — then we call it a forward edge of our path — or opposite to its given direction — 
then we call it a backward edge of our path. In other words, our path consists of one- 
way streets, and forward edges (backward edges) are those that we travel in the right 
direction ( in the wrong direction). Figure 493 shows a forward edge (w, v) and a backward 
edge (vv, v) of a path u x — > v k . 

CAUTION! Each edge in a network has a given direction, which we cannot change . 
Accordingly, if (w, t;) is a forward edge in a path — » v k> then (w, u) can become a backward 
edge only in another path x 1 — > Xj in which it is an edge and is traversed in the opposite 
direction as one goes from x x to xf. see Fig. 494. Keep this in mind, to avoid misunderstandings. 



Fig. 493. Forward edge (u, v) and Fig. 494. Edge ( u , v) as forward edge in the path 

backward edge (w, v) of a path v } — > v k — > v k and as backward edge in the path x ^ » x f 

Flow Augmenting Paths 

Our goal will be to maximize the flow from the source s to the target t of a given network. 
We shall do this by developing methods for increasing an existing flow (including the 
special case in which the latter is zero). The idea then is to find a path P: s t all of 
whose edges are not fully used, so that we can push additional flow through P. This 
suggests the following concept. 



SEC. 23.6 Flows in Networks 


975 


DEFINITION 


EXAMPLE 1 


Flow Augmenting Path 

A flow augmenting path in a network with a given flow on each edge (i, j) is a 
path P: s—> t such that 

(i) no forward edge is used to capacity; thus for these; 

(ii) no backward edge has flow 0; thus > 0 for these. 


Flow Augmenting Paths 

Find flow augmenting paths in the network in Fig. 495, where the First number is the capacity and the second 
number a given flow. 



Fig. 495. Network in Example 1 
First number = Capacity, Second number = Given flow 


Solution . In practical problems, networks are large and one needs a systematic method for augmenting 
flows , which we discuss in the next section. In our small network, which should help to illustrate and clarify 
the concepts and ideas, we can find flow augmenting paths by inspection and augment the existing flow / = 9 
in Fig. 495. (The outflow from s is 5 + 4 = 9, which equals the inflow 6 + 3 into ;.) 

We use the notation 


A y = Cjj — fij for forward edges 

A ij = for backward edges 

A = min A# taken over all edges of a path. 

From Fig. 495 we see that a flow augmenting path P x : s — » / is P x : 1 — 2 — 3 - 6 (Fig. 496). with 

A 12 = 20 - 5 = 15. etc., and A = 3. Hence we can use P A to increase the given flow 9to/ = 9 + 3= 12. 

All three edges of P x are forward edges. We augment the flow by 3. Then the flow in each of the edges of Pi 
is increased by 3, so that we now have = 8 (instead of 5), /23 = 1 1 (instead of 8), and f 36 = 9 (instead 

of 6). Edge (2, 3) is now used to capacity. The flow in the other edges remains as before. 

We shall now try to increase the flow in this network in Fig. 495 beyond / = 12. 

There is another flow augmenting path P 2 ‘ s-* /, namely, P 2 : 1 — 4 — 5 — 3 — 6 (Fig. 496). It shows how a 
backward edge comes in and how it is handled. Edge (3, 5) is a backward edge. It has flow 2, so that A 35 = 2. 
We compute A 14 = 10 - 4 = 6, etc. (Fig. 496) and A = 2. Hence we can use P 2 for another augmentation to 
get / = 12 + 2 = 14. The new flow is shown in Fig. 497. No further augmentation is possible. We shall confirm 
later that f - 14 is maximum. M 
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THEOREM 1 


PROOF 


Cut Sets 

A “cut set” is a set of edges in a network. The underlying idea is simple and natural. If 
we want to find out what is flowing from s to / in a network, we may cut the network 
somewhere between s and / (Fig. 497 shows an example) and see what is flowing in the 
edges hit by the cut, because any flow from s to t must sometimes pass through some of 
these edges. These form what is called a cut set. [In Fig. 497, the cut set consists of the 
edges (2, 3), (5, 2), (4, 5).] We denote this cut set by ( S , T). Here S is the set of vertices 
on that side of the cut on which .? lies (5 = {.v, 2, 4) for the cut in Fig. 497) and T is the 
set of the other vertices (T = {3, 5, /} in Fig. 497). We say that a cut “partitions” the 
vertex set V into two parts S and T. Obviously, the corresponding cut set (S, T) consists 
of all the edges in the network with one end in 5 and the other end in T. 


Cut 


/ 



Fig. 497. Maximum flow in Example 1 


By definition, the capacity cap (5, T) of a cut set (5, T) is the sum of the capacities of all 
forward edges in (5, T) (forward edges only!), that is, the edges that are directed from S to T t 

(3) cap (S, T) = [sum over the forward edges of (5, T)\. 

Thus, cap (5, T) = 11 + 7 = 18 in Fig. 497. 

The other edges (directed from T to S) are called backward edges of the cut set ( S , T\ 
and by the net flow through a cut set we mean the sum of the flows in the forward edges 
minus the sum of the flows in the backward edges of the cut set. 

CAUTION! Distinguish well between forward and backward edges in a cut set and in 
a path: (5, 2) in Fig. 497 is a backward edge for the cut shown but a forward edge in the 
path 1 — 4 — 5 — 2 — 3 — 6. 

For the cut in Fig, 497 the net flow is 1 1 + 6 - 3 = 14. For the same cut in Fig. 495 (not 
indicated there), the net flow is 8 + 4 — 3 = 9. In both cases it equals the flow /. We claim 
that this is not just by chance, but cuts do serve the purpose for which we have introduced them: 


Net Flow in Cut Sets 

Any given flow in a network G is the net flow through any cut set (5, T) of G. 


By KirchhofFs law (2), multiplied by - 1, at a vertex / we have 


(4) 





Outflow 


2 fli ~ 



Inflow 


if / =£ 5, t , 
if i = s. 
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THEOREM 2 


PROOF 


THEOREM 3 


PROOF 


Here we can sum over j and Z from 1 to n (= number of vertices) by putting = 0 for 
j = / and also for edges without flow or nonexisting edges; hence we can write the two 
sums as one. 


2 (fij - fa ) = 


if i # s , r, 
if / = s . 


We now sum over all / in S. Since s is in 5, this sum equals /: 

(5) 2 2 (fij - fji ) = /• 

ieS jeV 


We claim that in this sum, only the edges belonging to the cut set contribute. Indeed, 
edges with both ends in T cannot contribute, since we sum only over i in S ; but edges 
(ij) with both ends in S contribute -4 at one end and — at the other, a total contribution 
of 0. Hence the left side of (5) equals the net flow through the cut set. By (5), this is equal 
to the flow / and proves the theorem. ■ 

This theorem has the following consequence, which we shall also need later in this section. 


Upper Bound for Flows 

A flow f in a network G cannot exceed the capacity of any cut set (5, T) in G. 


By Theorem 1 the flow f equals the net flow through the cut set, / = f x — f 2 , where f 1 
is the sum of the flows through the forward edges and f 2 (= 0) is the sum of the flows 
through the backward edges of the cut set. Thus / ^ f v Now f l cannot exceed the sum 
of the capacities of the forward edges; but this sum equals the capacity of the cut set, by 
definition. Together, f ^ cap (S, T), as asserted. ■ 

Cut sets will now bring out the full importance of augmenting paths: 


Main Theorem. Augmenting Path Theorem for Flows 

A flow from s to t in a network G is maximum if and only if there does not exist a 
flow augmenting path s — » t in G. 


(a) If there is a flow augmenting path P: s — > r, we can use it to push through it an 
additional flow. Hence the given flow cannot be maximum. 

(b) On the other hand, suppose that there is no flow augmenting path s—*t in G. Let 
Sq be the set of all vertices i (including s) such that there is a flow augmenting path s — > /, 
and let T 0 be the set of the other vertices in G. Consider any edge (/, j) with / in S 0 and 
j in T 0 . Then we have a flow augmenting path s i since i is in S 0 , but s i — » j is not 
flow augmenting because j is not in S 0 . Hence we must have 


( 6 ) 


/* = 


l 0 


if O'./) 


is a 


forward 

.backward 


edge of the path s — » / — » j. 
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Otherwise we could use (/, j) to get a flow augmenting path s — » i — » j. Now (S 0 , 7o) 
defines a cut set (since t is in 7 * 0 ; why?). Since by (6), forward edges are used to capacity 
and backward edges carry no flow, the net flow through the cut set (S 0 , T 0 ) equals the 
sum of the capacities of the forward edges, which is cap (S 0 , T 0 ) by definition. This net 
flow equals the given flow f by Theorem 1. Thus / = cap (S 0 , T 0 ). We also have 
f ^ cap (S 0 , T 0 ) by Theorem 2. Hence / must be maximum since we have reached 
equality. ■ 

The end of this proof yields another basic result (by Ford and Fulkerson, Canadian Journal 
of Mathematics 8 (1956), 399-404), namely, the so-called 


THEOREM 4 


Max-Flow Min-Cut Theorem 

The maximum flow in any network G equals the capacity of a “minimum cut set” 
(= a cut set of minimum capacity) in G. 


PROOF We have just seen that / = cap (S 0 . T 0 ) for a maximum flow f and a suitable cut set 
(•So, T 0 ). Now by Theorem 2 we also have / ^ cap (5, T) for this / and any cut set (5, T) 
in G. Together, cap (5 0 , T 0 ) ^ cap (5, T). Hence (S 0 » T 0 ) is a minimum cut set. 

The existence of a maximum flow in this theorem follows for rational capacities from 
the algorithm in the next section and for arbitrary capacities from the Edmonds-Karp BFS 
also in that section. ■ 

The two basic tools in connection with networks are flow augmenting paths and cut sets. 
In the next section we show how flow augmenting paths can be used in an algorithm for 
maximum flows. 




[mi FLOW AUGMENTING PATHS 

Find flow augmenting paths: 
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54Sj MAXIMUM FLOW 

Find the maximum flow by inspection: 

5. In Prob. 1 . 

6. In Prob. 2. 

7. In Prob. 3. 

8. In Prob. 4. 

1 9-11 1 CAPACITY 

In Fig. 495 find T and cap (5, T) if S equals 

9. {1,2,3} 

10. (1,2,4, 5} 

11. {1,3,5} 

12. Find a minimum cut set in Fig. 495 and verify that its 
capacity equals the maximum flow / = 1 4. 

13. Find examples of flow augmenting paths and the 
maximum flow in the network in Fig. 498. 

14 — 16 1 CAPACITY 

In Fig. 498 find T and cap (5, T) if S equals 

14. (1,2,4} 


23.7 Maximum Flow: Ford-Fulkerson Algorithm 

Flow augmenting paths, as discussed in the last section, are used as the basic tool in the 
Ford-Fulkerson 4 algorithm in Table 23.8 on the next page in which a given flow (for instance, 
zero flow in all edges) is increased until it is maximum. The algorithm accomplishes the 
increase by a stepwise construction of flow augmenting paths, one at a time, until no further 
such paths can be constructed, which happens precisely when the flow is maximum. 

In Step 1, an initial flow may be given. In Step 3, a vertex j can be labeled if there is 
an edge (/, j) with / labeled and 

c ij > fij (“forward edge”) 

or if there is an edge (j, /) with / labeled and 

fji > 0 (“backward edge”). 

To scan a labeled vertex / means to label every unlabeled vertex j adjacent to i that can 
be labeled. Before scanning a labeled vertex /, scan all the vertices that got labeled before 
L This BFS (Breadth First Search) strategy was suggested by Edmonds and Karp in 
1972 {Journal of the Association for Computing Machinery 19, 248-64). It has the effect 
that one gets shortest possible augmenting paths. 


4 LESTER RANDOLPH FORD (born 1927) and DELBERT RAY FULKERSON (1924-1976), American 
mathematicians known for their pioneering work on flow algorithms. 


15. {1.2,4,6} 

16. {l,2,3,4,5} 

17. In Fig. 498 find a minimum cut set and its capacity. 



Fig. 498. Problems 13-17 


18. Why are backward edges not considered in the 
definition of the capacity of a cut set? 

19. In which case can an edge (/, j) be used as a forward 
as well as a backward edge of a path in a network with 
a given flow? 

20. (Incremental network) Sketch the network in Fig. 
498, and on each edge (ij) write Cy - f# and / y . Do 
you recognize that from this “incremental network” one 
can more easily see flow augmenting paths? 
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EXAMPLE 1 


Table 23.8 Ford-Fulkerson Algorithm for Maximum Flow 

Canadian Journal of Mathematics 9 (1957). 210-218 


ALGORITHM FORD-FULKERSON 


[G = (V, E ), vertices 1 (= 5), • • • , n (= /). edges (/, 7), Cy] 

This algorithm computes the maximum flow in a network G with source s , sink /, and 
capacities > 0 of the edges (/,y). 

INPUT: n, s = 1, t = /?. edges (/, y ) of G. Cy 
OUTPUT: Maximum flow / in G 


1. Assign an initial flow (for instance, = 0 for all edges), compute /. 

2. Label s by 0. Mark the other vertices “ unlabeled . ” 

3. Find a labeled vertex / that has not yet been scanned. Scan / as follows. For every 
unlabeled adjacent vertex j, if c.y > /y, compute 


^ij c ij fij 


and Aj = 


min (Ai, Ay) 


if / = 1 
if / > 1 


and label j with a “forward label” (/'*', Aj); or if fa > 0, compute 


Aj = min (A*, f 3i ) 

and label j by a '‘backward label” (/“, Aj). 

If no such j exists then OUTPUT /. Stop 
[/ is the maximum flow.] 

Else continue (that is, go to Step 4). 

4. Repeat Step 3 until t is reached. 

[This gives a flow augmenting path P: s — > /.] 

If it is impossible to reach / then OUTPUT /. Stop 
[/ is the maximum flow.] 

Else continue (that is, go to Step 5). 

5. Backtrack the path P, using the labels. 

6. Using P, augment the existing flow by A f . Set / = / + A t . 

7. Remove all labels from vertices 2, • • • , n. Go to Step 3. 
End FORD-FULKERSON 


Ford-Fulkerson Algorithm 

Applying the Ford-Fulkerson algorithm, determine the maximum flow for the network in Fig. 499 (which is 
the same as that in Example I . Sec. 23.6, so that we can compare). 

Solution. The algorithm proceeds as follows. 

1. An initial flow / = 9 is given. 

2. Label s (= I) by 0. Mark 2. 3, 4, 5, 6 “un labeled.” 
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Fig. 499. Network in Example 1 with capacities (first numbers) and given flow 

3. Scan 1 . 

Compute A 12 = 20 - 5 = 15 = A 2 . Label 2 by (l*. 15). 

Compute A 14 = 10 - 4 = 6 = A 4 . Label 4 by (I + , 6). 

4. Scan 2. 

Compute A93 = 11 - 8 = 3, A 3 = min (A 2 . 3) = 3. Label 3 by (2 + , 3). 

Compute A 5 = min (A 2 , 3) = 3. Label 5 by (2“, 3). 

Scan 3. 

Compute A 36 = 13 — 6 = 7, A 6 = A t = min (A 3 , 7) = 3. Label 6 by (3*. 3). 

5. P: 1 - 2 - 3 - 6 (= /) is a flow augmenting path. 

6. A f = 3. Augmentation gives / 12 = 8, f 2 3 = II, /3s = 9, other fcj unchanged. Augmented flow 
/ = 9 + 3 = 12. 

7. Remove labels on vertices 2, • • • . 6. Go to Step 3. 

3. Scan 1 . 

Compute A 12 = 20 - 8 = 12 = A 2 . Label 2 by (1 + . 12). 

Compute A 14 = 10 — 4 = 6 = A 4 . Label 4 by (1 + . 6). 

4. Scan 2. 

Compute A 5 = min (A 2 . 3) = 3. Label 5 by (2”, 3). 

Scan 4. [No vertex left for labeling.] 

Scan 5. 

Compute A 3 = min (A 5 . 2) = 2. Label 3 by (5“. 2). 

Scan 3. 

Compute A 36 = 13 - 9 = 4. A 6 = min (A 3 , 4) = 2. Label 6 by (3*, 2). 

5. P: 1 — 2 — 5 — 3 — 6 (= /) is a flow augmenting path. 

6. A f = 2. Augmentation gives ,f 12 = 10, / 52 = 1. / 3 5 = 0. / 36 =11, other unchanged. Augmented 
flow / = 12 + 2 = 14. 

7. Remove labels on vertices 2, • • ■ , 6. Go to Step 3. 

One can now scan 1 and then scan 2, as before, bui in scanning 4 and then 5 one finds that no vertex is left for 
labeling. Thus one can no longer reach t. Hence the flow obtained (Fig. 500) is maximum, in agreement with 
our result in the last section. I 
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zeassssi 



1. Do the computations indicated near the end of Example 
1 in detail. 

2. Solve Example 1 by Ford-Fulkerson with initial flow 
0. Is it more work than in Example 1 ? 

3. Which are the “bottleneck” edges by which the flow 
in Example 1 is actually limited? Hence which 
capacities could be decreased without decreasing the 
maximum flow? 

[4I7] MAXIMUM FLOW 

Find the maximum flow by Ford-Fulkerson: 

4. In Prob. 2, Sec. 23.6. 

5. In Prob. 1, Sec. 23.6. 

6. In Prob. 4, Sec. 23.6. 

7. In Prob. 3, Sec. 23.6. 

8. What is the (simple) reason that Kirchhoff s law is 
preserved in augmenting a flow by the use of a flow 
augmenting path? 

9. How does Ford-Fulkerson prevent the formation of 
cycles? 

10. How can you see that Ford-Fulkerson follows a BFS 
technique? 

11. Are the consecutive flow augmenting paths produced 
by Ford-Fulkerson unique? 

12. (Integer flow theorem) Prove that if the capacities in 
a network G are integers, then a maximum flow exists 
and is an integer. 

13. CAS PROBLEM. Ford-Fulkerson. Write a program 
and apply it to Probs. 4-7. 


14. If the Ford-Fulkerson algorithm stops without reaching 

show that the edges with one end labeled and the 
other end unlabeled form a cut set (S, T) whose 
capacity equals the maximum flow. 

15. (Several sources and sinks) If a network has several 

sources s l9 • • • . s k . show that it can be reduced to the 
case of a single-source network by introducing a new 
vertex s and connecting s to s x , • * * , s k by k edges of 
capacity Similarly if there are several sinks. Illustrate 

this idea by a network with two sources and two sinks. 

16. Find the maximum flow in the network in Fig. 501 with 
two sources (factories) and two sinks (consumers). 

17. Find a minimum cut set in Fig. 499 and its capacity. 

18. Show that in a network G with all = 1 , the maximum 
flow equals the number of edge-disjoint paths s — > /. 

19. In Prob. 17, the cut set contains precisely all forward 
edges used to capacity by the maximum flow 
(Fig. 500). Is this just by chance? 

20. Show that in a network G with capacities all equal to 
1 , the capacity of a minimum cut set (5, T) equals the 
minimum number q of edges whose deletion destroys 
all directed paths s — > 1 . (A directed path u — » w is a 
path in which each edge has the direction in which it 
is traversed in going from v to w.) 



23.8 Bipartite Graphs. Assignment Problems 

From digraphs we return to graphs and discuss another important class of combinatorial 
optimization problems that arises in assignment problems of workers to jobs, jobs to 
machines, goods to storage, ships to piers, classes to classrooms, exams to time periods, 
and so on. To explain the problem, we need the following concepts. 

A bipartite graph G = (V, E) is a graph in which the vertex set V is partitioned into two 
sets S and T (without common elements, by the definition of a partition) such that every 
edge of G has one end in S and the other in 7. Hence there are no edges in G that have both 
ends in S or both ends in T. Such a graph G = (V, E) is also written G = (. S , T; E). 

Figure 502 shows an illustration. V consists of seven elements, three workers a , b, c, 
making up the set 5, and four jobs 1, 2, 3, 4, making up the set T. The edges indicate that 
worker a can do the jobs 1 and 2, worker b the jobs 1, 2, 3, and worker c the job 4. The 
problem is to assign one job to each worker so that every worker gets one job to do. This 
suggests the next concept, as follows. 
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DEFINITION 


Maximum Cardinality Matching 

A matching in G = (S, T; E) is a set M of edges of G such that no two of them 
have a vertex in common. If M consists of the greatest possible number of edges, 
we call it a maximum cardinality matching in G. 


For instance, a matching in Fig. 502 is M 1 = {(a, 2), (ft, 1)}. Another is M 2 = {( a , 1), 
(ft, 3), (c, 4)}; obviously, this is of maximum cardinality. 


S T 



Fig. 502. Bipartite graph in the assignment 
of a set S = {a, ft, c) of workers 
to a set T = {1, 2, 3, 4} of jobs 


A vertex v is exposed (or not covered) by a matching M if v is not an endpoint of an 
edge of M. This concept, which always refers to some matching, will be of interest when 
we begin to augment given matchings (below). If a matching leaves no vertex exposed, 
we call it a complete matching. Obviously, a complete matching can exist only if S and 
T consist of the same number of vertices. 

We now want to show how one can stepwise increase the cardinality of a matching M 
until it becomes maximum. Central in this task is the concept of an augmenting path. 

An alternating path is a path that consists alternately of edges in M and not in M 
(Fig. 503A). An augmenting path is an alternating path both of whose endpoints (a and ft 
in Fig. 503B) are exposed. By dropping from the matching M the edges that are on an 
augmenting path P (two edges in Fig. 503B) and adding to M the other edges of P (three 
in the figure), we get a new matching, with one more edge than M. This is how we use 
an augmenting path in augmenting a given matching by one edge. We assert that this 
will always lead, after a number of steps, to a maximum cardinality matching. Indeed, the 
basic role of augmenting paths is expressed in the following theorem. 



(A) Alternating path 



a 


(B) Augmenting path P 

Fig. 503. Alternating and augmenting paths. 
Heavy edges are those belonging to a matching M. 
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THEOREM 1 


PROOF 


Augmenting Path Theorem for Bipartite Matching 

A matching Af in a bipartite graph G = (5. T; E) is of maximum cardinality if and 
only if there does not exist an augmenting path P with respect to M. 


(a) We show that if such a path P exists, then Af is not of maximum cardinality. Let P 
have q edges belonging to Af. Then P has q + 1 edges not belonging to A/. (In Fig. 503B 
we have q = 2.) The endpoints a and b of P are exposed, and all the other vertices on P 
are endpoints of edges in A/, by the definition of an alternating path. Hence if an edge of 
Af is not an edge of P, it cannot have an endpoint on P since then M would not be a 
matching. Consequently, the edges of M not on P, together with the </ 4- 1 edges of P not 
belonging to Af form a matching of cardinality one more than the cardinality of Af because 
we omitted q edges from Af and added q + 1 instead. Hence Af cannot be of maximum 
cardinality. 

(b) We now show that if there is no augmenting path for M , then Af is of maximum 
cardinality. Let Af* be a maximum cardinality matching and consider the graph H 
consisting of all edges that belong either to Af or to A/*, but not to both. Then it is possible 
that two edges of H have a vertex in common, but three edges cannot have a vertex in 
common since then two of the three would have to belong to Af (or to Af*), violating that 
Af and Af* are matchings. So every v in V can be in common with two edges of H or with 
one or none. Hence we can characterize each “component” (= maximal connected subset) 
of H as follows. 

(A) A component of H can be a closed path with an even number of edges (in the case 
of an odd number, two edges from Af or two from Af* would meet, violating the matching 
property). See (A) in Fig. 504. 

(B) A component of H can be an open path P with the same number of edges from Af 

and edges from Af*, for the following reason. P must be alternating, that is, an edge of 
Af is followed by an edge of Af*, etc. (since Af and Af* are matchings). Now if P had an 
edge more from Af*, then P would be augmenting for M [see (B2) in Fig. 504], 
contradicting our assumption that there is no augmenting path for Af. If P had an edge 
more from Af, it would be augmenting for Af* [see (B3) in Fig. 504], violating the 
maximum cardinality of Af*, by part (a) of this proof. Hence in each component of /f, die 
two matchings have the same number of edges. Adding to this the number of edges that 
belong to both Af and Af* (which we left aside when we made up //), we conclude that 
Af and Af* must have the same number of edges. Since Af* is of maximum cardinality, 
this shows that die same holds for Af, as we wanted to prove. ■ 


(A) 


<B1> 

(B2) 

(B3) 


- 8 *- — 


Edge from M 
■ - - - Edge from M* 


(Possible) 
(Augmenting for M) 
(Augmenting for M *) 


Fig. 504. Proof of the augmenting path theorem for bipartite matching 
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This theorem suggests the algorithm in Table 23.9 for obtaining augmenting paths, in 
which vertices are labeled for the purpose of backtracking paths. Such a label is in addition 
to the number of the vertex, which is also retained. Clearly, to get an augmenting path, 
one must start from an exposed vertex, and then trace an alternating path until one arrives 
at another exposed vertex. After Step 3 all vertices in S are labeled. In Step 4, the set T 
contains at least one exposed vertex, since otherwise we would have stopped at Step 1 . 


Table 23.9 Bipartite Maximum Cardinality Matching 

ALGORITHM MATCHING [G = (S, T; £), M, n] 

This algorithm determines a maximum cardinality matching M in a bipartite graph G by 
augmenting a given matching in G. 

INPUT: Bipartite graph G = (5, T; E) with vertices 1, • • • , ;i, matching M in G (for 
instance, M = 0) 

OUTPUT: Maximum cardinality matching M in G 

1. If there is no exposed vertex in S then 

OUTPUT M. Slop 

\M is of maximum cardinality in G.] 

Else label all exposed vertices in S with 0. 

2. For each / in 5 and edge (/, j) not in M, label j with /, unless already labeled. 

3. For each nonexposed j in T , label / with j, where / is the other end 

of the unique edge (ij) in M. 

4. Backtrack the alternating paths P ending on an exposed vertex in T 

by using the labels on the vertices. 

5. If no P in Step 4 is augmenting then 

OUTPUT M. Stop 

[M is of maximum cardinality in G.] 

Else augment M by using an augmenting path P. 

Remove all labels. 

Go to Step I . 

End MATCHING 


EXAMPLE 1 Maximum Cardinality Matching 

Is the matching in Fig. 505a of maximum cardinality? If not. augment it until maximum cardinality is reached. 



(a) Given graph (b) Matching M 2 

and matching Mj and new labels 


Fig. 505. Example 1 
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Solution. We apply the algorithm. 

1. Label 1 and 4 with 0. 

2. Label 7 with I. Label 5, 6, 8 with 3. 

3. Label 2 with 6. and 3 with 7. 

[All vertices are now labeled as shown in Fig . 474a.] 

4. P x : 1 — 7 — 3 — 5. [fl>’ backtracking, P^ is augmenting. ] 

P 2 . I - 7 - 3 - 8. [P 2 is augmenting.] 

5. Augment M t by using P\. dropping (3. 7) from Mi and including (1,7) and (3. 5). 

Remove all labels. Go to Step 1. 

Figure 474b shows the resulting matching M 2 — 1(1.7), (2, 6), (3, 5)}. 

1. Label 4 with 0. 

2. Label 7 with 2. Label 6 and 8 with 3. 

3. Label 1 with 7. and 2 with 6, and 3 with 5. 

4. 5 — 3 — 8. [P 2 is alternating but not augmenting.) 

5. Stop. M 2 is of maximum cardinality (namely, 3). M 



[mT| bipartite or not? 

Are the following graphs bipartite? If you answer is yes, 
find S and T. 



7. Can you obtain the answer to Prob. 3 from that to 
Prob. 1? 



11-13 


MAXIMUM CARDINALITY MATCHING 


Augmenting the given matching, find a maximum 
cardinality matching: 


11. In Prob. 9. 


12. In Prob. 8, 

13. In Prob. 10. 
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14. (Scheduling and matching) Three teachers a* 2 > 
teach four classes y 1? y 2 , V 4 for these numbers of 
periods: 



.Vi 

?2 

.Vs 

.V 4 

*1 

1 

0 

1 

1 

x 2 

1 

1 

1 

1 

*3 

0 

1 

1 

1 


Show that this arrangement can be represented by a 
bipartite graph G and that a teaching schedule for one 
period corresponds to a matching in G. Set up a 
teaching schedule with the smallest possible number of 
periods. 

15. (Vertex coloring and exam scheduling) What is 
the smallest number of exam periods for six subjects 
a , b, c, cl , e, f if some of the students simultaneously 
take a , /;, /, some c, d , e , some a , c\ e, and some c\ e? 
Solve this as follows. Sketch a graph with six vertices 
a , • • • , / and join vertices if they represent subjects 
simultaneously taken by some students. Color the 
vertices so that adjacent vertices receive different 
colors. (Use numbers 1, 2, • • • instead of actual colors 
if you want.) What is the minimum number of colors 
you need? For any graph G, this minimum number is 
called the (vertex) chromatic number ^(G ). Why is 
this the answer to the problem? Write down a possible 
schedule. 

16. How many colors do you need in vertex coloring the 
graph in Prob. 5? 

17. Show that all trees can be vertex colored with two 
colors. 

18. (Harbor management) How many piers does a 
harbormaster need for accommodating six cruise ships 
Si, 1 2 3 4 * • , Sq with expected dates of arrival A and 
departure D in July, (A, D) — (10, 13), (13, 15), 
(14, 17), (12, 15), (16, 18), (14, 17), respectively, if 
each pier can accommodate only one ship, arrival being 
at 6 a:m and departures at 1 1 p:m? Hint. Join 5, and Sj 
by an edge if their intervals overlap. Then color 
vertices. 

19. What would be the answer to Prob. 18 if only the five 


ships Si, 4 4 4 , S 5 had to be accommodated? 

20. (Complete bipartite graphs) A bipartite graph 
G = (5, T; E) is called complete if every vertex in S 
is joined to every vertex in T by an edge, and is denoted 
by Kn L ,n 2 * where n x and n 2 are the numbers of vertices 
in S and 7, respectively. How many edges does this 
graph have? 

21. (Planar graph) A planar graph is a graph that can be 
drawn on a sheet of paper so that no two edges cross. 
Show that the complete graph K 4 with four vertices is 
planar. The complete graph K 5 with five vertices is not 
planar. Make this plausible by attempting to draw K 5 
so that no edges cross. Interpret the result in terms of 
a net of roads between five cities. 

22. (Bipartite graph K 3>3 not planar) Three factories 1 , 
2, 3 are each supplied underground by water, gas, and 
electricity, from points A, /?, C, respectively. Show that 
this can be represented by K 3%3 (the complete bipartite 
graph G = ( S , T: E) with S and T consisting of three 
vertices each) and that eight of the nine supply lines 
(edges) can be laid out without crossing. Make it 
plausible that K 3 3 is not planar by attempting to draw 
the ninth line without crossing the others. 

23. (Four- (vertex) color theorem) The famous/<?z<r-co>/ 0 r 
theorem states that one can color the vertices of any 
planar graph (so that adjacent vertices get different 
colors) with at most four colors. It had been conjectured 
for a long time and was eventually proved in 1976 
by Appel and Haken [Illinois J. Math 21 (1977), 
429-567]. Can you color the complete graph K 5 with 
four colors? Does the result contradict the four-color 
theorem? (For more details, see Ref. [F 8 ] in App. 1 .) 

24. (Edge coloring) The edge chromatic number ^(G) of 
a graph G is the minimum number of colors needed for 
coloring the edges of G so that incident edges get 
different colors. Clearly, x e (G) ^ max d{u), where cl(u) 
is the degree of vertex u. If G = (5, T; E) is bipartite, 
the equality sign holds. Prove this for K n n . 

25. Vizing’s theorem states that for any graph G (without 
multiple edges!), max d(u) ^ * e (G) ^ max d(u) + 1. 
Give an example of a graph for which * e (G) does 
exceed max d(n). 


1. What is a graph? A digraph? A tree? A cycle? A path? 

2. State from memory how you can handle graphs and 
digraphs on computers. 

3. Describe situations and problems that can be modeled 
using graphs or digraphs. 

4. What is a shortest path problem? Give applications. 


STIONS AND PROBLEMS 


5. What is BFS? DFS? In what connection did these 
concepts occur? 

6 . Give some applications in which spanning trees play a 
role. 

7. What are bipartite graphs? What applications motivate 
this concept? 
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8. What is the traveling salesman problem? 

9. What is a network? What optimization problems are 
connected with it? 

10. Can a forward edge in one path be a backward edge in 
another path? In a cut set? Explain. 

11. There is a famous theorem on cut sets. Can you 
remember and explain it? 

1 12—17 1 MATRICES FOR GRAPHS OR DIGRAPHS 


Find the adjacency matrix of: 



18-20 1 GIVEN ADJACENCY MATRIX 

Sketch the graph whose adjacency matrix is: 



21. Make a vertex incidence list of the digraph in Prob. 13. 

22. Make a vertex incidence list of the digraph in Prob. 14. 

23-28 1 SHORTEST PATHS 

Find a shortest path and its length by Moore’s BFS 
algorithm, assuming that all the edges have length 1 : 




29. (Shortest spanning tree) Find a shortest spanning tree 
for the graph in Prob. 26. 

30. Find a shortest spanning tree in Prob. 27. 

31. Cayley’s theorem states that the number of spanning 
trees in a complete graph with n vertices is n n “ 2 . Verify 
this for n = 2, 3, 4. 

32. Show that 0(m 3 ) -I- 0(m z ) = 0(m 3 ). 

33-34[ MAXIMUM FLOW. 

Find the maximum flow, where the given numbers are 

capacities: 



35. Company A has offices in Chicago, Los Angeles, and 
New York, Company B in Boston and New York, 
Company C in Chicago, Dallas, and Los Angeles. 
Represent this by a bipartite graph. 

36. (Maximum cardinality matching). Augmenting the 
given matching, find a maximum cardinality matching: 

(H (D (D 


4 ) ( 5 ) 16 
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: SUMMARY OF CHAPTER 23 
Graphs and Combinatorial Optimization 


Combinatorial optimization concerns optimization problems of a discrete or 
combinatorial structure. It uses graphs and digraphs (Sec. 23.1) as basic tools. 

A graph G = ( V 9 E) consists of a set V of vertices u l5 v 2 > • • * , v n > (often simply 
denoted by 1, 2, • • • , /?) and a set E of edges e l9 e 2 , * * * , e ni , each of which connects 
two vertices. We also write (7, j) for an edge with vertices / and j as endpoints. A 
digraph (= directed graph) is a graph in which each edge has a direction (indicated 
by an arrow). For handling graphs and digraphs in computers, one can use matrices 
or lists (Sec. 23. 1 ). 

This chapter is devoted to important classes of optimization problems for graphs 
and digraphs that all arise from practical applications, and corresponding algorithms, 
as follows. 

In a shortest path problem (Sec. 23.2) we determine a path of minimum length 
(consisting of edges) from a vertex s to a vertex t in a graph whose edges (i\j) have 
a “length’' l,j > 0, which may be an actual length or a travel time or cost or an 
electrical resistance [if (/, j) is a wire in a net], and so on. Dijkstra’s algorithm 
(Sec. 23.3) or, when all Zy = 1, Moore’s algorithm (Sec. 23.2) are suitable for 
these problems. 

A tree is a graph that is connected and has no cycles (no closed paths). Trees are 
very important in practice. A spanning tree in a graph G is a tree containing all the 
vertices of G. If the edges of G have lengths, we can determine a shortest spanning 
tree, for which the sum of the lengths of all its edges is minimum, by Kruskal’s 
algorithm or Prim’s algorithm (Secs. 23.4, 23.5). 

A network (Sec. 23.6) is a digraph in which each edge (/, j) has a capacity 
t'ij > 0 [= maximum possible flow along (/, j)\ and at one vertex, the source s , a 
flow is produced that flows along the edges to a vertex t . the sink or target , where 
the flow disappears. The problem is to maximize the flow, for instance, by applying 
the Ford-Fulkerson algorithm (Sec. 23.7), which uses flow augmenting paths 
(Sec. 23.6). Another related concept is that of a cut set, as defined in Sec. 23.6. 

A bipartite graph G = (V7 E) (Sec. 23.8) is a graph whose vertex set V consists 
of two parts S and 7 such that every edge of G has one end in S and the other in 7, 
so that there are no edges connecting vertices in S or vertices in 7. A matching in 
G is a set of edges, no two of which have an endpoint in common. The problem 
then is to Find a maximum cardinality matching in G, that is, a matching M that 
has a maximum number of edges. For an algorithm, see Sec. 23.8. 
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CHAPTER 24 Data Analysis* Probability Theory 

CHAPTER 25 Mathematical Statistics 

Probability theory (Chap. 24) provides models of probability distributions (theoretical 
models of the observable reality involving chance effects) to be tested by statistical 
methods, and it will also supply the mathematical foundation of these methods in Chap. 25. 

Modern mathematical statistics (Chap. 25) has various engineering applications, for 
instance, in testing materials, control of production processes, quality control of production 
outputs, performance tests of systems, robotics, and automatization in general, production 
planning, marketing analysis, and so on. 

To this we could add a long list of fields of applications, for instance, in agriculture, 
biology, computer science, demography, economics, geography, management of natural 
resources, medicine, meteorology, politics, psychology, sociology, traffic control, urban 
planning, etc. Although these applications are very heterogeneous, we shall see that most 
statistical methods are universal in the sense that each of them can be applied in various 
fields. 

Additional Software for Probability and Statistics 

See also the list of software at the beginning of Part E on Numerical Analysis. 

DATA DESK. Data Description, Inc., Ithaca, NY. Phone 1-800-573-5121 or 
(607) 257-1000, website at www.datadescription.com. 

MINITAB. Minitab, Inc., College Park, PA. Phone 1-800-448-3555 or (814) 238-3280, 
website at www.minitab.com. 

SAS. SAS Institute, Inc., Cary, NC. Phone 1-800-727-0025 or (919) 677-8000, website 
at www.sas.com. 
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S-PLUS. Insightful Corporation, Inc., Seattle, WA. Phone 1-800-569-0123 or 
(206) 283-8802, website at www.insightful.com. 

SPSS. SPSS, Inc., Chicago, IL. Phone 1-800-543-2185 or (312) 651-3000, website at 
www.spss.com. 

STATISTICA. StatSoft, Inc., Tulsa, OK. Phone (918) 749-1119, website at 
www.statsoft.com. 



CHAPTER 2 4 



Data Analysis. 
Probability Theory 


We first show how to handle data numerically or in terms of graphs, and how to extract 
information (average size, spread of data, etc.) from them. If these data are influenced by 
“chance/’ by factors whose effect we cannot predict exactly (e.g., weather data, stock 
prices, lifespans of tires, etc.), we have to rely on probability theory. This theory 
originated in games of chance, such as flipping coins, rolling dice, or playing cards. 
Nowadays it gives madiematical models of chance processes called random experiments 
or, briefly, experiments. In such an experiment we observe a random variable X , that 
is, a function whose values in a trial (a performance of an experiment) occur “by chance” 
(Sec. 24.3) according to a probability distribution that gives the individual probabilities 
with which possible values of X may occur in the long run. (Example: Each of the six 
faces of a die should occur with the same probability. 1/6.) Or we may simultaneously 
observe more than one random variable, for instance, height and weight of persons or 
hardness and tensile strength of steel. This is discussed in Sec. 24.9, which will also give 
the basis for the mathematical justification of the statistical methods in Chap. 25. 

Prerequisite: Calculus. 

References and Answers to Problems: App. 1 , Part G, App. 2. 

24.1 Data Representation. Average. Spread 

Data can be represented numerically or graphically in various ways. For instance, your daily 
newspaper may contain tables of stock prices and money exchange rates, curves or bar charts 
illustrating economical or political developments, or pie charts showing how your tax dollar 
is spent. And there are numerous other representations of data for special purposes. 

In this section we discuss the use of standard representations of data in statistics. (For 
these, software packages, such as DATA DESK and MINITAB, are available, and Maple 
or Mathematica may also be helpful; see pp. 778 and 991) We explain corresponding 
concepts and methods in terms of typical examples, beginning with 

(1) 89 84 87 81 89 86 91 90 78 89 87 99 83 89. 

These are n = 14 measurements of the tensile strength of sheet steel in kg/mm 2 , recorded 

in the order obtained and rounded to integer values. To see what is going on, we sort 

these data, that is, we order them by size, 

(2) 78 81 83 84 86 87 87 89 89 89 89 90 91 99. 

Sorting is a standard process on the computer; see Ref. [E25], listed in App. 1. 
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Graphic Representation of Data 

We shall now discuss standard graphic representations used in statistics for obtaining 
information on properties of data. 

Stem-and-Leaf Plot 

This is one of the simplest but most useful representations of data. For (1) it is shown in 
Fig. 506. The numbers in (1) range from 78 to 99; see (2). We divide these numbers into 
5 groups, 75-79, 80-84, 85-89, 90-94, 95-99. The integers in the tens position of the 
groups are 7, 8, 8, 9, 9. These form the stem in Fig. 506. The first leaf is 8 (representing 
78). The second leaf is 134 (representing 81, 83, 84), and so on. 

The number of times a value occurs is called its absolute frequency. Thus 78 has 
absolute frequency 1, the value 89 has absolute frequency 4, etc. The column to the extreme 
left in Fig. 506 shows the cumulative absolute frequencies, that is, the sum of the absolute 
frequencies of the values up to the line of the leaf. Thus, the number 4 in the second line 
on die left shows that (1) has 4 values up to and including 84. The number 1 1 in the next 
line shows that there are 1 1 values not exceeding 89, etc. Dividing the cumulative absolute 
frequencies by n (= 14 in Fig. 506) gives the cumulative relative frequencies. 

Histogram 

For large sets of data, histograms are better in displaying the distribution of data than 
stem-and-leaf plots. The principle is explained in Fig. 507. (An application to a larger 
data set is shown in Sec. 25.7). The bases of the rectangles in Fig. 507 are the ^-intervals 
(known as class intervals) 74.5-79.5, 79.5-84.5, 84.5-89.5, 89.5-94.5, 94.5-99.5, whose 
midpoints (known as class marks) are a- = 77, 82, 87, 92, 97, respectively. The height 
of a rectangle with class mark x is the relative class frequency / re i(a), defined as the 
number of data values in that class interval, divided by n (= 14 in our case). Hence the 
areas of the rectangles are proportional to these relative frequencies, so that histograms 
give a good impression of the distribution of data. 

Center and Spread of Data: Median, Quartiles 

As a center of the location of data values we can simply take the median, the data value 
that falls in the middle when the values are ordered. In (2) we have 14 values. The seventh 
of them is 87, the eighth is 89, and we split the difference, obtaining the median 88. (In 
general, we would get a fraction.) 

The spread (variability) of the data values can be measured by the range R = A max — A min , 
the largest minus the smallest data values, R = 99 — 78 = 21 in (2). 
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of the data in (1) and (2) 

Fig. 507. Histogram of the data in 
(1) and (2) (grouped as in Fig. 506) 
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Better information gives the interquartile range IQR = q v — q L > Here the upper 
quartile q v is the middle value among the data values above the median. The lower 
quartile q L is the middle value among the data values below the median. Thus in (2) we 
have q v = 89 (the fourth value from the end), q L = 84 (the fourth value from the 
beginning), and IQR = 89 — 84 = 5. The median is also called the middle quartile and 
is denoted by q M . The rule of “splitting the difference” (just applied to the middle quartile) 
is equally well used for the other quartiles if necessary. 

Boxplot 

The boxplot of (1) in Fig. 508 is obtained from the five numbers jc min , q L , q M , q v , x max 
just determined. The box extends from q L to q v . Hence it has the height IQR. The position 
of the median in the box shows that the data distribution is not symmetric. The two lines 
extend from the box to x min below and to x max above. Hence they mark the range R. 

Boxplots are particularly suitable for making comparisons. For example. Fig. 508 shows 
boxplots of the data sets (1) and 

(3) 91 89 93 91 87 94 92 85 91 90 96 93 89 

(consisting of n = 13 values). Ordering gives 

(4) 85 87 89 89 90 91 91 91 92 93 93 94 96 

(tensile strength, as before). From the plot we immediately see that the box of (3) is shorter 
than the box of (1) (indicating the higher quality of the steel sheets!) and that q M is located 
in the middle of the box (showing the more symmetric form of the distribution). Finally, 
Xmax IS c I° ser to c lv f° r (3) than it is for (1), a fact that we shall discuss later. 

For plotting the box of (3) we took from (4) the values x min = 85, q L = 89, q M = 91, 
tfu 93, -r max 96. 

Outliers 

An outlier is a value that appears to be uniquely different from the rest of the data set. It 
might indicate that something went wrong with the data collection process. In connection 
with quartiles an outlier is conventionally defined as a value more than a distance of 1.5 
IQR from either end of the box. 
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Fig. 508. Boxplots of data sets (1) and (3) 
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For the data in (1) we have IQR = 5, q L = 84, q v = 89. Hence outliers are smaller 
than 84 — 7.5 or larger than 89 + 7.5, so that 99 is an outlier [see (2)]. The data (3) have 
no outliers, as you can readily verify. 

Mean. Standard Deviation. Variance 

Medians and quartiles are easily obtained by ordering and counting, practically without 
calculation. But they do not give full information on data: you can change data values to 
some extent without changing the median. Similarly for the quartiles. 

The average size of the data values can be measured in a more refined way by the mean 

1 n 1 

(5) X = — X Xj = — (*1 + *2 + • • • + •*'»)• 

n . , n 

j=i 

This is the arithmetic mean of the data values, obtained by taking their sum and dividing 
by the data size n. Thus in (1), 

x - ^ (89 + 84 + • • • + 89) = ^ = 87.3. 

Every data value contributes, and changing one of them will change the mean. 

Similarly, the spread (variability) of the data values can be measured in a more refined 
way by the standard deviation s or by its square, the variance 

(6) s 2 = — - 2 (• xj - xf = — [(a, - a -) 2 + • • • + (x n - a) 2 ]. 

n ~ j=i n ~ 1 

Thus, to obtain the variance of the data, take the difference xj — x of each data value from 
the mean, square it, take the sum of these n squares, and divide it by /? — 1 (not n , as we 
motivate in Sec. 25.2). To get the standard deviation s , take the square root of s 2 . 

For example, using x = 61 1/7, we get for the data (1) the variance 

s 2 = ^ [(89 - SLL) 2 + (84 - ^) 2 + • • • + (89 - W] = ^ ~ 25.14. 

Hence the standard deviation is s = V176/7 » 5.014. Note that the standard deviation 
has the same dimension as the data values (kg/mm 2 , see at the beginning), which is an 
advantage. On the other hand, the variance is preferable to the standard deviation in 
developing statistical methods, as we shall see in Chap. 25. 

CAUTION! Your CAS (Maple, for instance) may use 1 In instead of \/(n — 1) in (6), 
but the latter is better when n is small (see Sec. 25.2). 


aaaBiBiBa 

|l-ll>l DATA REPRESENTATIONS 

Represent the data by a stem-and-leaf plot, a histogram, and 
a boxplot: 

1. 20 21 20 19 20 19 21 19 

2. 7 6 407 1 2466 



3. 56 58 54 33 41 30 44 37 51 46 56 

38 38 49 39 

4 . 12.1 10 12.4 10.5 9.2 17.2 11.4 11.8 

14.7 9.9 

5. 70.6 70.9 69.1 71.3 70.5 69.7 71.5 69.8 

71.1 68.9 70.3 69.2 71.2 70.4 72.8 
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6. -0.52 0.11 -0.48 0.94 0.24 -0.19 -0.55 

7. Reaction time [sec] of an automatic switch 

2.3 2.2 2.4 2.5 2.3 2.3 2.4 2.1 2.5 2.4 

2.6 2.3 2.5 2.1 2.4 2.2 2.3 2.5 2.4 2.4 

8. Carbon content [%] of coal 

89 90 89 84 80 88 90 89 88 90 85 

87 86 82 85 76 89 87 86 86 

9. Weight of filled bottles [g] in an automatic filling process 

403 399 398 401 400 401 401 

10. Gasoline consumption [gallons per mile] of six cars of 
the same model 

14.0 14.5 13.5 14.0 14.5 14.0 

AVERAGE AND SPREAD 

Find the mean and compare it with the median. Find die 
standard deviation and compare it with die interquartile range. 


11. The data in Prob. 1. 

12. The data in Prob. 2. 

13. The data in Prob. 5. 

14. The data in Prob. 6. 

15. The data in Prob. 9. 

16. 5 22 7 23 6. Why is \x - cj M \ so large? 

17. Construct the simplest possible data with x = 100 but 
tffti = 0 . 

18. (Mean) Prove that x must always lie between the 
smallest and die largest data values. 

19. (Outlier, reduced data) Calculate s for the data 

4 1 3 10 2. Then reduce the data by deleting 

the outlier and calculate s. Comment. 

20. WRITING PROJECT. Average and Spread. 
Compare Q M , IQR and .v, s, illustrating the advantages 
and disadvantages with examples of your own. 


24 .2 Experiments, Outcomes, Events 

We now turn to probability theory. This theory has the purpose of providing mathematical 
models of situations affected or even governed by “chance effects,’' for instance, in weather 
forecasting, life insurance, quality of technical products (computers, batteries, steel sheets, 
etc.), traffic problems, and, of course, games of chance with cards or dice. And the accuracy 
of these models can be tested by suitable observations or experiments — this is a main 
purpose of statistics to be explained in Chap. 25. 

We begin by defining some standard terms. An experiment is a process of measurement 
or observation, in a laboratory, in a factory, on the street, in nature, or wherever; so 
“experiment” is used in a rather general sense. Our interest is in experiments that involve 
randomness, chance effects, so that we cannot predict a result exactly. A trial is a single 
performance of an experiment. Its result is called an outcome or a sample point, n trials 
then give a sample of size n consisting of n sample points. The sample space S of an 
experiment is the set of all possible outcomes. 

EXAMPLES 1-6 Random Experiments. Sample Spaces 

(1) Inspecting a lightbulb. S — [Defective, Nondefective). 

(2) Rolling a die. S = 1 1, 2. 3. 4, 5, 6). 

(3) Measuring tensile strength of wire. S the numbers in some interval. 

(4) Measuring copper content of brass. S: 50 % to 90%. say. 

(5) Counting daily traffic accidents in New York. 5 the integers in some interval. 

(6) Asking for opinion about a new car model. S = [Like, Dislike, Undecided). I 

The subsets of S are called events and the outcomes simple events. 

EXAMPLE 7 Events 

In (2), events are A = { I, 3, 5} ( “Odd number"), B = [2, 4, 6} ("Even number"), C = |5, 6}, etc. Simple 
events are { 1 ), {2}. • • • . {6). ■ 
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EXAMPLE 8 


If in a trial an outcome a happens and a E A (a is an element of A), we say that A happens. 
For instance, if a die turns up a 3, the event A: Odd number happens. Similarly, if C in 
Example 1 happens (meaning 5 or 6 turns up), then D = {4, 5, 6) happens. Also note that 
S happens in each trial, meaning that some event of S always happens. All this is quite natural. 

Unions, Intersections, Complements of Events 

In connection with basic probability laws we shall need the following concepts and facts 
about events (subsets) A, B, C, • • • of a given sample space S. 

The union A U B of A and B consists of all points in A or B or both. 

The intersection A fl B of A and B consists of all points that are in both A and B. 

If A and B have no points in common, we write 

A H B = 0 

where 0 is the empty set (set with no elements) and we call A and B mutually exclusive 
(or disjoint) because in a trial the occurrence of A excludes that of B (and conversely) — 
if your die turns up an odd number, it cannot turn up an even number in the same trial. 
Similarly, a coin cannot turn up Head and Tail at the same time. 

Complement A c of A. This is the set of all the points of S not in A. Thus, 

A n A c = 0, A U A c = S. 

In Example 7 we have A c = B, hence A U A c = { 1, 2, 3, 4, 5, 6} = S. 

Another notation for the complement of A is A (instead of A c ), but we shall not use this 
because in set theory A is used to denote the closure of A (not needed in our work). 
Unions and intersections of more events are defined similarly. The union 

m 

U A* = Ax U A 2 U • • • U A m 
S — i 

of events A 1? • • • , A m consists of all points that are in at least one Aj. Similarly for the 
union A x U A 2 U • • • of infinitely many subsets A lt A 2 , • • • of an infinite sample space 
S (that is, S consists of infinitely many points). The intersection 

m 

n Aj = Aj O A 2 0***0 A m 

of A lt * * * , A m consists of the points of S that are in each of these events. Similarly for 
the intersection A 1 O A 2 O * • • of infinitely many subsets of S. 

Working with events can be illustrated and facilitated by Venn diagrams 1 for showing 
unions, intersections, and complements, as in Figs. 509 and 510, which are typical 
examples that give the idea. 

Unions and Intersections of 3 Events 

In rolling a die, consider ihe evenis 

A: Number greater than 3. B: Number less than 6, C: Even number. 

Then A D B = {4, 5). B 0 C = (2. 4). C fl A = {4, 6), A fl B fl C - (4). Can you sketch a Venn diagram 
of this? Furthermore. A U B = 5, hence A U B U C = S (why?). H 


\J0HN VENN (1834-1923), English mathematician. 
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Fig. 509. Venn diagrams showing two events A and B in a sample space S 
and their union A U B (colored) and intersection A O B (colored) 



Fig. 510. Venn diagram for the experiment of rolling a die, showing 5, 
A = {I, 3, 5), C = (5,6),AUC = {1, 3, 5, 6}, A fl C = {5} 





SAMPLE SPACES, EVENTS 


Graph a sample space for the experiment: 


1. Tossing 2 coins 

2. Drawing 4 screws from a lot of right-handed and 
left-handed screws 

3. Rolling 2 dice 

4. Tossing a coin until the first Head appears 

5. Rolling a die until the first “Six ” appears 

6. Drawing bolts from a lot of 20, containing one 
defective £), until D is drawn, one at a time and 
assuming sampling without replacement, that is, 
bolts drawn are not returned to the lot 


7. Recording the lifetime of each of 3 lightbulbs 

8. Choosing a committee of 3 from a group of 5 people 

9. Recording the daily maximum temperature X and the 
maximum air pressure Y at some point in a city 


10. In Prob. 3, circle and mark the events A: Equal faces, 
B: Sum exceeds 9, C: Sum equals 7. 

11. In rolling 2 dice, are the events A: Sum divisible by 3 
and B: Sum divisible by 5 mutually exclusive? 

12. Answer the question in Prob. 1 1 for rolling 3 dice. 

13. In Prob. 5 list the outcomes that make up the event E: 
First “Six” in rolling at most 3 times. Describe E c . 

14. List all 8 subsets of the sample space S = [a, b, c). 


15-20 


VENN DIAGRAMS 


15. In connection with a trip to Europe by some students, 
consider the events P that they see Paris, G that they 
have a good time, and M that they run out of money, 
and describe in words the events 1, • • • , 7 in the 
diagram. 


G 



16. Using Venn diagrams, graph and check the rules 
A U (B fl C) = (A U B) O (A U C) 
a n (B u c) = (A n B) u (A n c). 


17. (De Morgan’s laws) Using Venn diagrams, graph and 
check De Morgan ’s laws 

(A U B) c = A c H B° 

(A H B) c = A c U B c . 
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18. Using a Venn diagram, show that A C B if and only if 
A H B = A. 

19. Show that, by die definition of complement, for any 
subset A of a sample space S , 


(A c ) c = A, S c = 0, 0 C = 5, 

A U A c = S, A O A c = 0. 

20. Using a Venn diagram, show that A C B if and only if 
A U B = 5. 


24.3 Probability 

The “probability” of an event A in an experiment is supposed to measure how frequently 
A is about to occur if we make many trials. If we flip a coin, then heads H and tails T 
will appear about equally often — we say that H and T are “equally likely.” Similarly, for 
a regularly shaped die of homogeneous material (“fair die”) each of the six outcomes 
1, • • • , 6 will be equally likely. These are examples of experiments in which the sample 
space S consists of finitely many outcomes (points) that for reasons of some symmetry 
can be regarded as equally likely. This suggests the following definition. 


DEFINITION 1 


First Definition of Probability 

If the sample space S of an experiment consists of finitely many outcomes (points) 
that are equally likely, then the probability P{A) of an event A is 


( 1 ) 


_ Number of points in A 
* ; Number of points in S 


From this definition it follows immediately that, in particular, 
(2) P(S) = 1. 


EXAMPLE 1 Fair Die 

In rolling a fair die once, what is the probability P(A) of A of obtaining a 5 or a 6? The probability of B: “ Even 
number "? 

Solution . The six outcomes are equally likely, so that each has probability 1/6. Thus P{A) = 2/6 = 1/3 
because A = {5, 6] has 2 points, and P(B) = 3/6 = 1/2. ■ 

Definition 1 takes care of many games as well as some practical applications, as we shall 
see, but certainly not of all experiments, simply because in many problems we do not 
have finitely many equally likely outcomes. To arrive at a more general definition of 
probability, we regard probability as the counterpart of relative frequency . Recall from 
Sec. 24. 1 that the absolute frequency f(A) of an event A in n trials is the number of times 
A occurs, and the relative frequency of A in these trials is f(A)/n; thus 

„ / . , fW Number of times A occurs 

/relW = 


( 3 ) 


n 


Number of trials 
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Now if A did not occur, then /(A) = 0. If A always occurred, then f(A) = n . These are 
the extreme cases. Division by n gives 

(4*) 0 ^/ re i(A)S 1. 

In particular, for A =5 we have f{S) = n because S always occurs (meaning that some 
event always occurs; if necessary, see Sec. 24.2, after Example 7). Division by n gives 

(5*) /reltf) = I- 

Finally, if A and B are mutually exclusive, they cannot occur together. Hence the absolute 
frequency of their union A U B must equal the sum of the absolute frequencies of A and 
B. Division by n gives the same relation for the relative frequencies, 

(6*) /rel(A u B) = / rel (A) + (A H B = 0). 

We are now ready to extend the definition of probability to experiments in which equally 
likely outcomes are not available. Of course, the extended definition should include 
Definition 1. Since probabilities are supposed to be the theoretical counterpart of relative 
frequencies, we choose the properties in (4*), (5*), (6*) as axioms. (Historically, such a 
choice is the result of a long process of gaining experience on what might be best and 
most practical.) 


General Definition of Probability 

Given a sample space S, with each event A of S (subset of S ) there is associated a 
number P(A), called the probability of A, such that the following axioms of 
probability are satisfied. 

1. For every A in S, 

(4) 0 S P(A) S 1. 

2. The entire sample space S has the probability 

(5) />($)=!. 

3. For mutually exclusive events A and B (A PI B = 0; see Sec. 24.2), 

(6) P(A U B) = P(A) + P(B) (A H B = 0). 

If S is infinite (has infinitely many points), Axiom 3 has to be replaced by 
3'. For mutually exclusive events A a , A 2 , * * ■ , 

(6') P(A 1 U A 2 U ■ • •) = P(A ± ) + P(A 2 ) + • • • . 


In the infinite case the subsets of S on which P(A) is defined are restricted to form a 
so-called cr-cilgebra , as explained in Ref. [GR6] (not [G6]!) in App. 1. This is of no 
practical consequence to us. 
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THEOREM 1 


PROOF 


EXAMPLE 2 


THEOREM 2 


EXAMPLE 3 


Basic Theorems of Probability 

We shall see that the axioms of probability will enable us to build up probability theory 
and its application to statistics. We begin with three basic theorems. The first of them is 
useful if we can get the probability of the complement A c more easily than P(A ) itself. 


Complementation Rule 

For an event A and its complement A c in a sample space S, 
(7) P(A°) = 1 - P{A). 


By the definition of complement (Sec. 24.2), we have S = A U A c and A H A c = 0. 
Hence by Axioms 2 and 3, 

1 = P(S) = P(A) + P(A C ), thus P(A C ) = 1 - P(A). ■ 


Coin Tossing 

Five coins are tossed simultaneously. Find the probability of the event A: At least one head turns up . Assume 
that the coins are fair. 

Solution, Since each coin can turn up heads or tails, the sample space consists of 2 5 = 32 outcomes. Since 
the coins are fair, we may assign the same probability (1/32) to each outcome. Then the event A c (No heads 
rum up) consists of only 1 outcome. Hence P(/\ c ) = 1/32, and the answer is P(A) = 1 - P(A C ) = 31/32. ■ 

The next theorem is a simple extension of Axiom 3, which you can readily prove by 
induction. 


Addition Rule for Mutually Exclusive Events 

For mutually exclusive events A x , • • • , A m in a sample space S , 

(8) P(A X U A 2 U • • • A m ) = P(A X ) + P(A 2 ) + • • • + P(A m ). 


Mutually Exclusive Events 

If the probability that on any workday a garage will get 10-20, 21-30, 31-40, over 40 cars to service is 0.20, 
0.35. 0.25, 0.12, respectively, what is the probability that on a given workday the garage gets at least 21 cars 
to service? 

Solution, Since these are mutually exclusive events, Theorem 2 gives the answer 0.35 + 0.25 + 0.12 = 0.72. 
Check this by the complementation rule. ■ 

In many cases, events will not be mutually exclusive. Then we have 


Addition Rule for Arbitrary Events 

For events A and B in a sample space r 

(9) P(A U B) = P(A) + P(B) - P(A n B). 


THEOREM 3 
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PROOF 


EXAMPLE 4 


THEOREM 4 


C, D , E in Fig. 5 1 1 make up A U B and are mutually exclusive (disjoint). Hence by 
Theorem 2. 

P(A U B) = P(Q 4- P(D) 4- P(E). 

This gives (9) because on the right P(C) 4- P(D) = P(A) by Axiom 3 and disjointness; 
and P(E) = P(B) - P(D) = P(B) — P(A D B ), also by Axiom 3 and disjointness. ■ 



A B 


Fig. 511. Proof of Theorem 3 

Note that for mutually exclusive events A and B we have A fl B = 0 by definition and, 
by comparing (9) and (6), 

( 10 ) P(0) = 0 . 

(Can you also prove this by (5) and (7)?) 

Union of Arbitrary Events 

In tossing a fair die. what is the probability of getting an odd number or a number less than 4? 

Solution . Let A be the event “Odd number " and B the event “ Number less than 4.” Then Theorem 3 gives 
the answer 

P(A U B) = | + | - | = § 

because A D B — "Odd number less than 4” = {I, 3}. B 

Conditional Probability. Independent Events 

Often it is required to find the probability of an event B under the condition that an event 
A occurs. This probability is called the conditional probability ofB given A and is denoted 
by P(B|A). In this case A serves as a new (reduced) sample space, and that probability is 
the fraction of P(A) which corresponds to A fl B. Thus 

, p(a n B) 

( 11 ) P(B\A ) = v [P(A) * 0 ]. 

Similarly, the conditional probability of A given B is 

. P(A D B) 

(12) P(A\B) = [P(B) * 0], 

r\£>) 

Solving (11) and (12) for P(A fl B ), we obtain 


Multiplication Rule 

If A and B are events in a sample space S and P(A) ^ 0, P(B) =£ 0, then 
( 13 ) P(A DB) = P(A)P(B\A) = P(B)P(A\B). 
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EXAMPLE S 


EXAMPLE 6 


Multiplication Rule 

In producing screws, let A mean "screw too slim" and B “screw too short." Lei P(A ) = 0. 1 and let the conditional 
probability that a slim screw is also too short be P(B\A) = 0.2. What is the probability that a screw that we pick 
randomly from the lot produced will be both too slim and too short? 

Solution . P(A n B) = P{A)P(B\A) = 0.1 • 0.2 = 0.02 = 2%. by Theorem 4. ■ 

Independent Events. If events A and B are such that 

(14) P(A HB) = P(A)P(B ), 


they are called independent events. Assuming P(A) =r= 0, P(B) 0, we see from (1 1)— (13) 

that in this case 

P(A\B) = P(A), P(B\A) = P(P). 

This means that the probability of A does not depend on the occurrence or nonoccurrence 
of P, and conversely. This justifies the term “independent.” 


Independence of m Events. Similarly, m events A x , • • • , A m are called independent if 

(15a) P(A X H • • • PI A m ) = P(Ai) • • • P(A m ) 

as well as for every k different events A^, Aj 2 , • • • , Aj k 

(15b) P(A dl DA h H • • ■ H A k ) = P{A h )P{A h ) • • • P(A jk ) 

where k = 2, 3, • • • , m — 1 . 

Accordingly, three events A , B, C are independent if and only if 

P(A DB) = P(A)P(B ), 

P(B fl C) = P(P)P(C), 

(16) 

P(C n A) = P(C)P(A), 

P(Ar\BnC) = P(A)P(B)P(C). 

Sampling. Oiu* next example has to do with randomly drawing objects, one at a time , 
from a given set of objects. This is called sampling from a population, and there are 
two ways of sampling, as follows. 

1. In sampling with replacement, the object that was drawn at random is placed back 
to the given set and the set is mixed thoroughly. Then we draw the next object at 
random. 

2. In sampling without replacement the object that was drawn is put aside. 

Sampling With and Without Replacement 

A box contains 10 screws, three of which are defective. Two screws are drawn at random. Find the probability 
that none of the two screws is defective. 

Solution . We consider the events 


A: First drawn screw nondefective. 

B: Second drawn screw nondefective. 
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Clearly. P(A) = ^ because 7 of ihe 10 screws are nondefective and we sample at random, so that each screw 
has the same probability of being picked. If we sample with replacement, the situation before the second 
drawing is the same as at the beginning, and P(B) = The events are independent, and the answer is 

P(A HB) = P{A)P(B) = 0.7 • 0.7 = 0.49 = 49%. 

If we sample without replacement, then P{A) = as before. If A has occurred, then there are 9 screws left 
in the box. 3 of which are defective. Thus P(B\A) = f = § , and Theorem 4 yields the answer 

P(A nfl) = ^*§~47%. 

Is it intuitively clear that this value must be smaller than die preceding one? M 


I P R O B L E M S E TE39I . 3E 


1. Three screws are drawn at random from a lot of 100 
screws, 10 of which are defective. Find the probability 
that the screws drawn will be nondefective in drawing 
(a) with replacement, (b) without replacement. 

2. In Prob. 1 find the probability of E: At least 1 defective 
(i) directly, (ii) by using complements; in both cases 
(a) and (b). 

3. If we inspect paper by drawing 5 sheets without 
replacement from every batch of 500, what is the 
probability of getting 5 clean sheets although 2% of 
the sheets contain spots ? First guess. 

4* Under what conditions will it make practically no 
difference whether we sample with or without 
replacement? Give numeric examples. 

5. If you need a right-handed screw from a box containing 
20 right-handed and 5 left-handed screws, what is the 
probability that you get at least one right-handed screw 
in drawing 2 screws with replacement? 

6. If in Prob. 5 you draw without replacement, does the 
probability decrease or increase? First think, then 
calculate. 

7. What gives the greater probability of hitting some target 
at least once: (a) hitting in a shot with probability 1/2 
and firing 1 shot, or (b) hitting in a shot with probability 
1/4 and firing 2 shots? First guess. Then calculate. 

8. Suppose that we draw cards repeatedly and with 
replacement from a file of 100 cards, 50 of which refer 
to male and 50 to female persons. What is the 
probability of obtaining the second “female” card 
before the third “male” card? 

9. What is the complementary event of the event 
considered in Prob. 8? Calculate its probability and use 
it to check your result in Prob. 8. 

10. In rolling two fair dice, what is the probability of 
obtaining a sum greater than 4 but not exceeding 7? 


11. in rolling two fair dice, what is the probability of 
obtaining equal numbers or numbers with an even 
product? 

12. Solve Prob. 1 1 by considering complements. 

13. A motor drives an electric generator. During a 30-day 
period, the motor needs repair with probability 8% and 
the generator needs repair with probability 4%. What 
is the probability that during a given period, the entire 
apparatus (consisting of a motor and a generator) will 
need repair? 

14. If a circuit contains 3 automatic switches and we want 
that, with a probability of 95%. during a given time 
interval they are all working, what probability of failure 
per time interval can we admit for a single switch? 

15. If a certain kind of tire has a life exceeding 25 000 miles 
with probability 0.95, what is the probability that a set of 
4 of these tires on a car will last longer than 25 000 miles? 

16. In Prob. 15, what is the probability that at least one of 
the tires will not last for 25 000 miles? 

17. A pressure control apparatus contains 4 valves. The 
apparatus will not work unless all valves are operative. 
If the probability of failure of each valve during some 
interval of time is 0.03, what is the corresponding 
probability of failure of the apparatus? 

18. Show that if B is a subset of A, then P(B) ^ P(A). 

19. Extending Theorem 4, show that 

P(A n B n C) = P(A)P(B\A)P(C\A n B). 

20. You may wonder whether in (16) the last relation 
follows from the others, but the answer is no. To see 
this, imagine that a chip is drawn from a box containing 
4 chips numbered 000, 011, 101, 110, and let A, B , C 
be the events that the first, second, and third digit, 
respectively, on the drawn chip is 1 . Show that then 
the first three formulas in (16) hold but the last one 
does not hold. 
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24.4 Permutations and Combinations 

Permutations and combinations help in finding probabilities P(A) = a!k by systematically 
counting the number a of points of which an event A consists; here, k is the number of 
points of the sample space S. The practical difficulty is that a may often be surprisingly 
large, so that actual counting becomes hopeless. For example, if in assembling some 
instrument you need 10 different screws in a certain order and you want to draw them 
randomly from a box (which contains nothing else) the probability of obtaining them in 
the required order is only 1/3 628 800 because there are 


10! = 1 ■ 2 • 3 • 4 • 5 • 6 • 7 • 8 • 9 • 10 = 3 628 800 


orders in which they can be drawn. Similarly, in many other situations the numbers of 
orders, arrangements, etc. are often incredibly large. (If you are unimpressed, take 20 
screws — how much bigger will the number be?) 


Permutations 

A permutation of given things {elements or objects) is an arrangement of these things in 
a row in some order. For example, for three letters a, b , c there are 3! = 1 • 2 • 3 = 6 
permutations: abc , cicb , bac , bca , cab , cba . This illustrates (a) in the following theorem. 


THEOREM 1 


Permutations 

(a) Different things. The number of permutations of n different things taken 
all at a time is 

(1) n\ = 1 • 2 • 3 - • • /? (read “n factorial”). 


(b) Classes of equal things . If n given things can be divided into c classes of 
alike things differing from class to class , then the number of permutations of 
these things taken all at a time is 


( 2 ) 


n\ 

n^.n^. * * • n c \ 


(«i + n 2 + * ' * 4- n c = n) 


where nj is the number of things in the jth class . 


PROOF (a) There are n choices for filling the first place in the row. Then n — 1 things are still 
available for filling the second place, etc. 

(b) alike things in class 1 make n±\ permutations collapse into a single permutation 
(those in which class 1 things occupy the same n x positions), etc., so that (2) follows 
from (1). g 
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EXAMPLE 1 


THEOREM 2 


EXAMPLE 2 


Illustration of Theorem 1(b) 

If a box contains 6 red and 4 blue balls, the probability of drawing first the red and then the blue balls is 

P = 6141/10! = 1/210 « 0.5%. ■ 

A permutation of n things taken k at a time is a permutation containing only k of the 
n given things. Two such permutations consisting of the same k elements, in a different 
order, are different, by definition. For example, there are 6 different permutations of the 
three letters a , /;, c, taken two letters at a time, ab , ac , be , ba , ca , cb . 

A permutation of n things taken & at a time with repetitions is an arrangement 
obtained by putting any given thing in the first position, any given thing, including a 
repetition of the one just used, in the second, and continuing until k positions are filled. 
For example, there are 3 2 = 9 different such permutations of a , b y c taken 2 letters at a 
time, namely, the preceding 6 permutations and aa , bb , cc. You may prove (see Team 
Project 18): 


Permutations 

The number of different petmutations of n different things taken kata time without 
repetitions is 

(3a) n(n - 1 )(n - 2) • • • (n - k + 1) = — ' 

(n - k)\ 

and with repetitions is 

(3b) n k . 


Illustration of Theorem 2 

In a coded telegram the letters are arranged in groups of five letters, called words. From (3b) we see that the 
number of different such words is 

26 s = 1 1 881 376. 

From (3a) it follows that the number of different such words containing each letter no more than once is 


261/(26 - 5)1 = 26 • 25 • 24 • 23 • 22 = 7 893 600. 


Combinations 

In a permutation, the order of the selected things is essential. In contrast, a combination 
of given things means any selection of one or more things without regard to order. There 
are two kinds of combinations, as follows. 

The number of combinations of n different things, taken k at a time, without 
repetitions is the number of sets that can be made up from the n given things, each set 
containing k different things and no two sets containing exactly the same k things. 

The number of combinations of n different things, taken k at a time, with repetitions 
is the number of sets that can be made up of k things chosen from the given n things, 
each being used as often as desired. 
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THEOREM 3 


PROOF 


EXAMPLE 3 


For example, there are three combinations of the three letters a , b , c 9 taken two letters 
at a time, without repetitions, namely, ab , ac , to, and six such combinations with 
repetitions, namely, ab, ac , to, aa, to, cc. 


Combinations 

The number of different combinations ofn different things taken , k at a time , without 
repetitions, is 


(4a) 


nl 


n(n — 1) • * * (n — k + 1) 


k!(n-k)l 1-2 

the number of those combinations with repetitions is 


(4b) 


CD 


The statement involving (4a) follows from the first part of Theorem 2 by noting that there 
are k! permutations of k things from the given n things that differ by the order of the 
elements (see Theorem 1), but there is only a single combination of those k things of the 
type characterized in the first statement of Theorem 3. The last statement of Theorem 3 
can be proved by induction (see Team Project 1 8). ■ 


Illustration of Theorem 3 


The number of samples of five lightbulbs that can be selected from a lot of 500 bulbs is [see (4a)J 


/500\ _ 500! 

\ 5 j ~ 5 ! 495 ! 


500 -499 -498 -497 -496 
l -2- 3-4-5 


255 244 687 600. ■ 


Factorial Function 

In ( 1 )-(4) the factorial function is basic. By definition, 

(5) 0! = 1. 

Values may be computed recursively from given values by 

(6) (n + 1)! = (/» + 1)«!. 

For large n the function is very large (see Table A3 in App. 5). A convenient approximation 
for large n is the Stirling formula 2 


(7) 


n\ 


~ V27771 



(e = 2.718 • • •) 


2 JAMES STIRLING (1692-1770). Scots mathematician. 
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EXAMPLE 4 


where — is read “asymptotically equal” and means that the ratio of the two sides of (7) 
approaches 1 as n approaches infinity. 

Stirling Formula 


n! 

By (7) 

Exact Value 

Relative Error 

4! 

23.5 

24 

2.1% 

10! 

3 598 696 

3 628 800 

0.8% 

20! 

2.422 79 • 10 18 

2 432 902 008 176 640 000 

0.4% 


Binomial Coefficients 

The binomial coefficients are defined by the formula 

a\ a(a — l)(a — 2) • • • (a — k + 1) 


( 8 ) 


k\ 


(k = 0, integer). 


The numerator has k factors. Furthermore, we define 

(9) =1, in particular, 

For integer a = n we obtain from (8) 

( 10 ) 

Binomial coefficients may be computed recursively, because 

( 11 ) 


= 1 . 


CM-'.) 


(n§ 0,0glS n). 


(H-K::) 


Formula (8) also yields 

( 12 ) 

There are numerous further relations; we mention two important ones. 


(■;)=<->* ( m+ r') 


(k ^ 0, integer). 


( k = 0, integer) 
(m > 0 ). 


(13) 


( * 2 0, n £ 1, 
both integer) 

and 



(14) 

j(5 (-*)-("') 

(r g 0, integer). 
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1. List all permutations of four digits 1, 2, 3, 4, taken all 
at a time. 

2. List (a) all permutations, (b) all combinations without 
repetitions, (c) all combinations with repetitions, of 5 
letters a , e , /, o , u taken 2 at a time. 

3. In how many ways can we assign 8 workers to 8 jobs 
(one worker to each job and conversely)? 

4. How many samples of 4 objects can be drawn from a 
lot of 80 objects? 

5. In how many different ways can we choose a 
committee of 3 from 20 persons? First guess. 

6. In how many different ways can we select a committee 
consisting of 3 engineers, 2 biologists, and 2 chemists 
from 10 engineers, 5 biologists, and 6 chemists? First 
guess. 

7. Of a lot of 1 0 items, 2 are defective, (a) Find the number 
of different samples of 4. Find the number of samples 
of 4 containing (b) no defectives, (c) 1 defective, (d) 2 
defectives. 

8. If a cage contains 100 mice, two of which are male, 
what is the probability that the two male mice will be 
included if 12 mice are randomly selected? 

9. An urn contains 2 blue, 3 green, and 4 red balls. We 
draw 1 ball at random and put it aside. Then we draw 
the next ball, and so on. Find the probability of drawing 
at first the 2 blue balls, then the 3 green ones, and 
finally the red ones. 

10. By what factor is the probability in Prob. 9 decreased 
if the number of balls is doubled (4 blue, etc.)? 

11. Determine the number of different bridge hands. (A 
bridge hand consists of 13 cards selected from a full 
deck of 52 cards.) 

12. In how many different ways can 5 people be seated at 
a round table? 

13. If 3 suspects who committed a burglary and 6 innocent 
persons are lined up, what is the probability that a 
witness who is not sure and has to pick three persons 
will pick the three suspects by chance? That the witness 
picks 3 innocent persons by chance? 


14. (Birthday problem) What is die probability that in a 
group of 20 people (that includes no twins) at least two 
have the same birthday, if we assume that the 
probability of having birthday on a given day is 1/365 
for every day. First guess. 

15. How many different license plates showing 5 symbols, 
namely, 2 letters followed by 3 digits, could be made? 

16. How many automobile registrations may the police 
have to check in a hit-and-run accident if a witness 
reports KDP5 and cannot remember the last two digits 
on the license plate but is certain that all three digits 
were different? 

17. CAS PROJECT. Stirling formula, (a) Using (7), 
compute approximate values of n\ for n = 1, • • • , 20. 

(b) Determine the relative error in (a). Find an 
empirical formula for that relative error. 

(c) An upper bound for that relative error is e l}i2n — 1 . 
Try to relate your empirical formula to this. 

(d) Search through the literature for further 
information on Stirling’s formula. Write a short report 
about your findings, arranged in logical order and 
illustrated with numeric examples. 

18. TEAM PROJECT. Permutations, Combinations. 

(a) Prove Theorem 2. 

(b) Prove the last statement of Theorem 3. 

(c) Derive (II) from (8). 

(d) By the binomial theorem, 

(a + b) n = 2 a k b n ~ k , 

so that ci k b n ~ k has the coefficient (/!). Can you 
conclude this from Theorem 3 or is this a mere 
coincidence? 

(e) Prove (14) by using the binomial theorem. 

(f) Collect further formulas for binomial coefficients 
from the literature and illustrate them numerically. 


24 .! Random Variables. 

Probability Distributions 

In Sec. 24. 1 we considered frequency distributions of data. These distributions show the 
absolute or relative frequency of the data values. Similarly, a probability distribution 
or, briefly, a distribution, shows the probabilities of events in an experiment. The quantity 
that we observe in an experiment will be denoted by X and called a random variable (or 
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stochastic variable) because the value it will assume in the next trial depends on chance, 
on randomness — if you roll a dice, you get one of the numbers from 1 to 6, but you don’t 
know which one will show up next. Thus X — Number a die turns up is a random variable. 
So is X = Elasticity of rubber (elongation at break). (“Stochastic” means related to chance.) 

If we count (cars on a road, defective screws in a production, tosses until a die shows 
the first Six), we have a discrete random variable and distribution. If we measure 
(electric voltage, rainfall, hardness of steel), we have a continuous random variable and 
distribution. Precise definitions follow. In both cases the distribution of X is determined 
by the distribution function 

(1) F(x) = P(X* x); 

this is the probability that in a trial, X will assume any value not exceeding x. 

CAUTION! The terminology is not uniform. F(x) is sometimes also called the 

cumulative distribution function. 

For (1) to make sense in both the discrete and the continuous case we formulate 
conditions as follows. 


DEFINITION 


Random Variable 

A random variable X is a function defined on the sample space S of an experiment. 
Its values are real numbers. For every number a the probability 

P(X = a) 

with which X assumes a is defined. Similarly, for any interval I the probability 

P(X G 7) 

with which X assumes any value in 1 is defined. 


Although this definition is very general, practically only a very small number of 
distributions will occur over and over again in applications. 

From (l) we obtain the fundamental formula for the probability corresponding to an 
interval a < x ^ b. 


( 2 ) 


P(a < X ^ b) = F{b) ~ F(a). 


This follows because X a {“X assumes any value not exceeding a”) and a < X ^ b 
(" X assumes any value in the interval a < x ^ b”) are mutually exclusive events, so that 
by (1) and Axiom 3 of Definition 2 in Sec. 24.3 

F(b) = P(X ^ b) = P(X ^ a) + P(a<X^b) 

= F(a) + P(a<X^b) 


and subtraction of F(a) on both sides gives (2). 




1012 


CHAP. 24 Data Analysis. Probability Theory 


EXAMPLE 1 


Discrete Random Variables and Distributions 

By definition, a random variable X and its distribution are discrete if X assumes only 
finitely many or at most countably many values * lf a 2 > x 3 , • • • , called the possible values 
of X, with positive probabilities p j = P{X = a x ), p 2 = P(X = x 2 ), p z = P(X = x 3 ), ■ • ■ , 
whereas the probability P(X E /) is zero for any interval 1 containing no possible value. 

Clearly, the discrete distribution of X is also determined by the probability function 
f(x) of X, defined by 


( 3 ) 


m = 


f 


otherwise 


U = l, 2, • • •), 


From this we get the values of the distribution function F( x) by taking sums. 


( 4 ) 


F(x) = 2 f(*j) = 2 Pj 

XjSX Xj^x 


where for any given x we sum all the probabilities p 3 - for which is smaller than or equal 
to that of x. This is a step function with upward jumps of size pj at the possible values 
Xj of X and constant in between. 

Probability Function and Distribution Function 

Figure 512 shows the probability function f(x ) and the distribution function Fix) of the discrete random variable 

X = Number a fair die turns up . 

X has the possible values a* = 1, 2, 3, 4, 5, 6 with probability 1/6 each. At these a the distribution function has 
upward jumps of magnitude 1/6. Hence from the graph of /(a) we can construct the graph of F( a), and conversely. 
Tn Figure 512 (and the next one) at each jump the fat dot indicates th e function value at the jump! ■ 


fix) | 

y 6 



fix) 

‘111111.. 

Ve- 


i i 


i i I l 


! ! ! i I 

10 12 


Fix) 



Fig. 512. Probability function f(x) 
and distribution function F(x ) of the 
random variable X = Number 
obtained in tossing a fair die once 



Fig. 513. Probability function f[x) and 
distribution function F(x) of the random 
variable X = Sum of the two numbers 
obtained in tossing two fair dice once 
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EXAMPLE 2 


EXAMPLE 3 


EXAMPLE 4 


Probability Function and Distribution Function 

The random variable X = Sum of the two numbers two fair dice turn up is discrete and has the possible values 
2 (= 1 + I), 3, 4. • • « , 12 (= 6 4- 6). There are 6 • 6 = 36 equally likely outcomes (I, l) (1, 2), • • ■ . (6, 6), 
where the first number is that shown on the first die and the second number that on the other die. Each such 
outcome has probability 1/36. Now X = 2 occurs in the case of the outcome (1. 1); X = 3 in the case of the 
two outcomes ( 1 , 2) and (2, 1 ); X = 4 in the case of the three outcomes (1,3), (2, 2), (3 t 1 ); and so on. Hence 
fix) = P(X = x) and F(x) = P(X = x) have the values 


X 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

Six) 

1/36 

2/36 

3/36 

4/36 

5/36 

6/36 

5/36 

4/36 

3/36 

2/36 

1/36 

Fix) 

1/36 

3/36 

6/36 

10/36 

15/36 

21/36 

26/36 

30/36 

33/36 

35/36 

36/36 


Figure 513 shows a bar chart of this function and the graph of the distribution function, which is again a step 
function, with jumps (of different height!) at the possible values of X. H 


Two useful formulas for discrete distributions are readily obtained as follows. For the 
probability corresponding to intervals we have from (2) and (4) 


(5) P(a < X ^ b) = F(b) — F(a) = X Pj (X discrete). 

a<xj^b 

This is the sum of all probabilities pj for which x j satisfies a < Xj ^ b . (Be careful about 
< and ^!) From this and P{S) = 1 (Sec. 24.3) we obtain the following formula. 

(6) X Pj = 1 (sum of all probabilities). 

j 

Illustration of Formula (5) 

In Example 2. compute the probability of a sum of at least 4 and at most 8. 

Solution. PI 3 <XS8) = F( 8) - FO) = n - = §§■ * 

Waiting Time Problem. Countably infinite Sample Space 

In tossing a fair coin, let X = Number of trials until the first head appears. Then, by independence of events 
(Sec. 24.3). 

P(X = I) = P(H) =| (H = Head) 

P(X=2) = P(TH) = H =} (7 = Tail) 

P(X = 3) = POTff) = = g. etc. 

and in general P(X = n ) = (£)", n = 1 . 2. • • • . Also, (6) can be confirmed by the sum formula for the geometric 
series, 



= - 1 + 2 = 1 . 
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Continuous Random Variables and Distributions 

Discrete random variables appear in experiments in which we count (defectives in a 
production, days of sunshine in Chicago, customers standing in a line, etc.). Continuous 
random variables appear in experiments in which we measure (lengths of screws, voltage 
in a power line, Brinell hardness of steel, etc.). By definition, a random variable X and 
its distribution are of continuous type or, briefly, continuous, if its distribution function 
F(x) [defined in (1)] can be given by an integral 


(7) 


F(x) = 



dv 


(we write v because x is needed as the upper limit of the integral) whose integrand /( x), 
called die density of the distribution, is nonnegative, and is continuous, perhaps except 
for finitely many rvalues. Differentiation gives the relation of / to F as 

(8) f(x) = F'(x) 

for every x at which fix) is continuous. 

From (2) and (7) we obtain the very important formula for the probability corresponding 
to an interval: 


(9) Pia 

This is the analog of (5). 
From (7) and P(S) = 1 

( 10 ) 


< X ^ b) = Fib) - Fid) = \ fiu) dv. 

(Sec. 24.3) we also have the analog of (6): 

[ fiv) dv = 1. 

J — 00 


Continuous random variables are simpler than discrete ones with respect to intervals. 
Indeed, in die continuous case the four probabilities corresponding to a < X ^ b, 
a < X < b, a ^ X < b, and a ^ X ^ b with any fixed a and b (> a) are all the same. 
Can you see why? iAnswer. This probability is the area under the density curve, as in 
Fig. 514, and does not change by adding or subtracting a single point in the interval of 
integration.) This is different from the discrete case! (Explain.) 

The next example illustrates notations and typical applications of our present 
formulas. 


Curve of density 



Fig. 514. Example illustrating formula (9) 
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EXAMPLE 5 Continuous Distribution 

Let X have the density function /( x) = 0.75(1 - .v 1 2 ) if - 1 ^ x 1 and zero otherwise. Find the distribution 
function. Find the probabilities P(-% = X ^ and P(\ ^ X ^ 2). Find x such that P(X = x) — 0.95. 

Solution . From (7) we obtain F(x) = 0 if .v ^ — 1, 

F(.v) = 0.75 J (1 - v 2 ) dv = 0.5 + 0.75a- - 0.25 a 3 4 5 6 7 if - 1 < x ^ I. 

-l 

and Fix) = 1 if a > l. From this and (9) we get 

f 1/2 

/>(-§ ^ X ^ h) = F(%) ~ F(~k) = 0.75 I (1 - u 2 ) db = 68.75% 

- 1/2 

(because P{-\^X^%) = /*(-£ < X = §) for a continuous distribution) and 

P(± ^ X ^ 2) = F(2) - F(|) = 0.75 f (1 - u 2 ) dv = 31.64%. 

J If 4 

(Note that the upper limit of integration is 1, not 2. Why?) Finally, 

P(X ^ a) = F(x) = 0.5 + 0.75a - 0.25a 3 = 0.95. 

Algebraic simplification gives 3a - a 3 = 1 .8. A solution is a = 0.73, approximately. 

Sketch f(x) and mark a = \, and 0.73, so that you can see the results (the probabilities) as areas under 

the curve. Sketch also Fix). U 

Further examples of continuous distributions are included in the next problem set and in 
later sections. 






1. Graph die probability function /(a) = kx 2 

(a = l, 2, 3, 4, 5; k suitable) and the distribution 
function. 

2. Graph the density function /(a) = kx 2 (0 ^ a ^ 5; 
k suitable) and the distribution function. 

3. (Uniform distribution) Graph / and F when the 
density is /(a) = k = const if —4 ^ a ^ 4 and 0 
elsewhere. 

4. In Prob. 3 find P(0 ^ a ^ 4) and c such that 
P(-c < X < c) = 95%. 

5. Graph / and F when /(- 2) = /( 2) = 1/8, 

/(-I) = /(l) = 3/8. Can / have further positive 
values? 

6. Graph the distribution function F(x) - 1 — e -3 * if 
a > 0, F(x) = 0 if a ^ 0, and the density /(a). Find a 
such that F(x) — 0.9. 

7. Let X be the number of years before a particular type 
of machine will need replacement. Assume that X has 
the probability function /(l) = 0.1, /( 2) = 0.2, 
/(3) = 0.2, /(4) = 0.2, /(5) = 0.3. Graph f and F. 
Find the probability that die machine needs no 


replacement during the first 3 years. 

8. If X has the probability funcdon /(a) = k/2 x 
(a = 0, 1, 2, • • •), what are k and P(X ^ 4)? 

9. Find the probability that none of the three bulbs in 
a traffic signal must be replaced during the first 1 200 
hours of operation if the probability that a bulb must 
be replaced is a random variable X with density 
f{x) = 6[0.25 - (x ~ 1.5) 2 ] when 1 S x 5 2 and 
/(a) = 0 otherwise, where x is time measured in 
multiples of 1000 hours. 

10. Suppose that certain bolts have length L = 200 4- X mm, 
where X is a random variable with density 
fix) = |(1 — a 2 ) if —1 ^ x = I and 0 otherwise. 
Determine c so that with a probability of 95% a bolt 
will have any length between 200 - c and 200 + c. 
Hint: See also Example 5. 

11. Let X [millimeters] be the thickness of washers a 
machine turns out. Assume that X has the density 
fix) = kx if 1.9 < a < 2.1 and 0 otherwise. Find k. 
What is the probability that a washer will have 
thickness between 1.95 mm and 2.05 mm? 
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12. Suppose that in an automatic process of filling oil 
into cans, the content of a can (in gallons) is 
Y = 50 + X, where X is a random variable with 
density fix) = 1 - |.v| when |.v| ^ 1 and 0 when 
|.v| > 1 . Graph fix) and Fix). In a lot of 1 00 cans, about 
how many will contain 50 gallons or more? What is 
the probability that a can will contain less than 49.5 
gallons? Less than 49 gallons? 

13. Let the random variable X with density fix) = ke~ x if 
0 ^ x ^ 2 and 0 otherwise (a* = time measured in 
years) be the tune after which certain ball bearings are 
worn out. Find k and the probability that a bearing will 
last at least l year. 

14. Let X be the ratio of sales to profits of some firm. 
Assume that X has the distribution function F(x) = 0 
if x < 2, F(x) = (a - 2 - 4)/5 if 2 § .v < 3, F(x) = 1 if 
.v ^ 3. Find and graph the density. What is the probability 


that X is between 2.5 (40% profit) and 5 (20% profit)? 

15. Show that b<c implies P(X ^ b) ^ P(X ^ c). 

16. If the diameter X of axles has the density fix) = k if 
119.9 ^ jc = 120.1 and 0 otherwise, how many 
defectives will a lot of 500 axles approximately contain 
if defectives are axles slimmer than 1 19.92 or thicker 
than 120.08? 

17. Let X be a random variable that can assume every real 
value. What are the complements of the events X ^ b. 
X<b 1 X^c,X>t\b^X^cJ)<X^ c? 

18. A box contains 4 right-handed and 6 left-handed 
screws. Two screws are drawn at random without 
replacement. Let X be the number of left-handed screws 
drawn. Find die probabilities P(X = 0). P{X = 1), 
P{X = 2), P(1 < X < 2), P(X ^ 1), PiX ^ 1), 
P(X> 1), and P(0.5 < X < 10). 


24.6 Mean and Variance of a Distribution 

The mean /z and variance cr 2 of a random variable X and of its distribution are the theoretical 
counterparts of the mean x and variance s 2 of a frequency distribution in Sec. 24.1 and 
serve a similar purpose. Indeed, the mean characterizes the central location and the variance 
the spread (the variability) of the distribution. The mean jjl (mu) is defined by 


(1) 

(a) 

/Z 2) Xjf{Xj) 

.7 

(Discrete distribution) 


(b) 

M = J xfU) dx 

— cc 

(Continuous distribution) 


and the variance cr 2 (sigma square) by 

(a) cr 2 = 2 (Xj — fiffiXj) (Discrete distribution) 

(2) r 

(b) cr 2 =1 (x — fiffix) dx (Continuous distribution). 

— oc 

<j(the positive square root of cr 2 ) is called the standard deviation of X and its distribution. 
/ is the probability function or the density, respectively, in (a) and (b). 

The mean /z is also denoted by £(X) and is called the expectation ofX because it gives 
the average value of X to be expected in many trials. Quantities such as /z and cr 2 that 
measure certain properties of a distribution are called parameters, /z and cr 2 are the two 
most important ones. From (2) we see that 

(3) cr 2 > 0 

(except for a discrete “distribution” with only one possible value, so that a 2 = 0). We 
assume that /z and a 2 exist (are finite), as is the case for practically all distributions that 
are useful in applications. 
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EXAMPLE 1 


EXAMPLE 2 


THEOREM 1 


Mean and Variance 

The random variable X = Number of heads in a single loss of a fair coin has the possible values X = 0 and X = I 
with probabilities P(X = 0) = \ and P(X = 1) = ^ From (la) we thus obtain the mean fx = 0 + I and 

(2a) yields the variance 

<T 2 = (o-i) 2 4 + (i-!> 2 -£=4' ■ 


Uniform Distribution. Variance Measures Spread 

The distribution with the density 

fix) = — — — if a < x < b 

b — a 


and / = 0 otherwise is called the uniform distribution on the interval a < x < b. From (lb) (or from Theorem 
I. below) we find that fx = (a + b)f2, and (2b) yields the variance 

- af 




Figure 515 illustrates that the spread is large if and only if cr is large. 



Symmetry. We can obtain the mean p without calculation if a distribution is symmetric. 
Indeed, you may prove 


Mean of a Symmetric Distribution 

If a distribution is symmetric with respect to x = c, that is, f(c — a*) = f(c + a), 
then p = c. (Examples 1 and 2 illustrate this.) 


Transformation of Mean and Variance 

Given a random variable X with mean p and variance a 2 , we want to calculate the mean 
and variance of X* = a x 4* a 2 X , where a 1 and a 2 are given constants. This problem is 
important in statistics, where it appears often. 
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THEOREM 2 


PROOF 


Transformation of Mean and Variance 

(a) If a random variable X has mean fx and variance cr 2 , then the random 
variable 

(4) X* = a x + * 2 X (a 2 > 0) 

has the mean /x* and variance cr* 2 , where 

(5) /x* = a x + a 2 fx and cr* 2 = a 2 2 cr 2 . 


(b) In particular ; the standardized random variable Z corresponding to X, 
given by 


( 6 ) 


Z = 


X- fJL 

cr 


has the mean 0 and the variance 1 . 


We prove (5) for a continuous distribution. To a small interval I of length Ax on the 
x-axis there corresponds the probability /(x)Ax [approximately; the area of a rectangle of 
base Ax and height /(x)]. Then the probability /(x)Ax must equal that for the corresponding 
interval on the x*-axis, that is, /*(x*)Ax*, where /* is the density of X* and Ax* is the 
length of the interval on the x*-axis corresponding to I. Hence for differentials we have 
f*(x*) dx* = f(x) dx. Also, x* = a x 4- a 2 x by (4), so that (lb) applied to X* gives 


/x* = f x*f*(x*) dx * 

J -oo 

= [ (#i + a 2 x)f(x) dx 

— oc 

r r* 

= fli J f( x) dx + a 2 j xf(x) dx. 


On the right the first integral equals 1, by (10) in Sec. 24.5. The second integral is /a. This 
proves (5) for /a*. It implies 

x* - ix* = (a t + a 2 x) - (a x + a 2 /J.) = a 2 (x - /a). 

From this and (2) applied to X again using /*(a*) dx* = f(x) dx, we obtain the second 
formula in (5), 

* oc 

or * 2 = J (x* - fi *) z f*(x*) dx* = a 2 I (a - /a) 2 /(a) dx = a 2 a 2 . 

-00 -CO 

For a discrete distribution the proof of (5) is similar. 

Choosing a 1 = -pJa and a 2 = Mar we obtain (6) from (4), writing X * = Z. For these 
«2 formula (5) gives /a* = 0 and a* 2 = 1, as claimed in (b). ■ 
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Expectation, Moments 

Recall that (1) defines the expectation (the mean) of X , the value of X to be expected on 
the average, written jx = E(X). More generally, if g(x) is nonconstant and continuous for 
all x , then g(X) is a random variable. Hence its mathematical expectation or, briefly, its 
expectation E(g(X)) is the value of g(X) to be expected on the average, defined [similarly 
to (1)] by 

(7) E(g(X)) = 2 or E(g(X)) = J g(x)f(x) dx. 

J “°° 

In the first formula, / is the probability function of the discrete random variable X . In the 
second formula, / is the density of the continuous random variable X. Important special 
cases are the Zcth moment of X (where k = 1, 2, • • •) 

r 00 

(8) E(X k ) = 2 xffiXj) or I x k f(x)dx 

i 

and the Ath central moment of X (A = 1, 2, • • •) 

(9) EQX - /x] fc ) = 2 C* - V-) k f( Xj ) or / (x - n) k f(x) dx. 

j 

This includes the first moment, the mean of X 

(10) lx = E(X) [(8) with A = 1], 

It also includes the second central moment, the variance of X 

(11) o- 2 = E([X - ixf) [(9) with k = 2]. 

For later use you may prove 

(12) E( 1) = 1. 





1-6 


MEAN, VARIANCE 


Find the mean and the variance of the random variable X 
with probability function or density f(x). 


1. f(x) = 2x (O^x^l) 

2. /( 0) = 0.512, /(l) = 0.384, /( 2) = 0.096, 
/( 3) = 0.008 

3. X = Number a fair die turns up 

4. Y = —AX + 5 with X as in Prob. 1 

5. Uniform distribution on [0, 8] 

6. /( x) = 2e~ 2x (x ^ 0) 


7. What is the expected daily profit if a store sells X air 
conditioners per day with probability /(10) = 0.1, 
/( 11) = 0.3, /(12) = 0.4, /(13) = 0.2 and the profit 
per conditioner is $55? 

8. What is the mean 1 ife of a light bulb whose life X [hours] 
has the density f(x) = 0.001^~ o oola? (jc ^ 0)? 

9. If the mileage (in multiples of 1000 mi) after which a tire 
must be replaced is given by the random variable X with 
density f(x) = Be~ 6x (x > 0), what mileage can you 
expect to get on one of these tires? Let 6 = 0.04 and find 
the probability that a tire will last at least 40000 mi. 
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10. What sum can you expect in rolling a fair die 10 times? 
Do it. Repeat this experiment 20 times and record how 
the sum varies. 

11. A small filling station is supplied with gasoline every 
Saturday afternoon. Assume that its volume X of sales 
in ten thousands of gallons has the probability density 
/(a) = 6.v( 1 - a) if 0 = .y = 1 and 0 otherwise. 
Determine the mean, the variance, and the standardized 
variable. 

12. What capacity must the tank in Prob. 1 1 have in order 
that the probability that the tank will be emptied in a 
given week be 5%? 

13. Let X [cm] be the diameter of bolts in a production. 
Assume that X has the density 

/(a) = k(x - 0.9)(1.1 - x) if 0.9 < x < 1.1 and 0 
otherwise. Determine k< sketch /(a), and find jx and o 2 . 

14. Suppose that in Prob. 13, a bolt is regarded as being 
defective if its diameter deviates from 1 .00 cm by more 
than 0.09 cm. What percentage of defective bolts 
should we then expect? 

15. For what choice of the maximum possible deviation c 
from 1.00 cm shall we obtain 3% defectives in Probs. 
13 and 14? 


16. TEAM PROJECT. Means, Variances, Expectations. 

(a) Show that£(X - ju.) = 0, <r 2 = E(X Z ) - ft 2 . 

(b) Prove (1 0)-( 12). 

(c) Find all the moments of the uniform distribution 
on an interval a ^ a * ^ b. 

(d) The skewness y of a random variable X is defined 
by 

(13) 7=-^ E([X - j/.] 3 ). 


Show that for a symmetric distribution (whose third 
central moment exists) the skewness is zero. 

(e) Find the skewness of the distribution with density 
/(a) = xe~ x when x > 0 and f(x) = 0 otherwise. 
Sketch /(a*). 

(f) Calculate the skewness of a few simple discrete 
distributions of your own choice. 

(g) Find a nonsymmetric discrete distribution with 3 
possible values, mean 0, and skewness 0. 


24 .j Binomial, Poisson, and Hypergeometric 
Distributions 

These are the three most important discrete distributions, with numerous applications. 


Binomial Distribution 

The binomial distribution occurs in games of chance (rolling a die, see below, etc.), 
quality inspection (e.g., counting of the number of defectives), opinion polls (counting 
number of employees favoring certain schedule changes, etc.), medicine (e.g., recording 
the number of patients recovered by a new medication), and so on. The conditions of its 
occurrence are as follows. 

We are interested in the number of times an event A occurs in n independent trials. In 
each trial the event A has the same probability P(A) = p. Then in a trial, A will not occur 
with probability q = 1 - p. In n trials the random variable that interests us is 

X = Number of times the event A occurs in n trials. 

X can assume the values 0, 1 , ••*,/?, and we want to determine the corresponding 
probabilities. Now X = x means that A occurs in x trials and in n - x trials it does not 
occur. This may look as follows. 



SEC 24.7 Binomial, Poisson, and Hypergeometric Distributions 


1021 


A A- - A B B • • • B. 

_ j *- j 

( 1 ) 

x times /? — x times 

Here B = A° is the complement of A, meaning that A does not occur (Sec. 24.2). We now 
use the assumption that the trials are independent, that is, they do not influence each other. 
Hence ( I ) has the probability (see Sec. 24.3 on independent events) 

PP 'P • C/Cf ■ q= p x q n ~ x . 

(l*) ' — 7 T — ' 

.v times n — x times 

Now (1) is just one order of arranging a* A’s and n — x B's. We now use Theorem 1(b) 
in Sec. 24.4, which gives the number of permutations of n things (the n outcomes of the 
n trials) consisting of 2 classes, class 1 containing the /Zj = x A’s and class 2 containing 
the /7 — /?!=/? — x B's. This number is 

a!(/i-a)! \x) 

Accordingly, (1*) multiplied by this binomial coefficient gives the probability P(X = a) of 
X = a, that is, of obtaining A precisely x times in n trials. Hence X has the probability function 

(2) fix) = (") p x q n ~* (a- = 0, I. •••,«) 


and /(a) = 0 otherwise. The distribution of X with probability function (2) is called the 
binomial distribution or Bernoulli distribution. The occurrence of A is called success 
(regardless of what it actually is; it may mean that you miss your plane or lose your watch) 
and the nonoccurrence of A is called/h/7«/*e\ Figure 516 shows typical examples. Numeric 
values can be obtained from Table A5 in App. 5 or from your CAS. 

The mean of the binomial distribution is (see Team Project 16) 

(3) fx = np 
and the variance is (see Team Project 16) 

(4) a 2 = npq. 



Fig. 516. Probability function (2) of the binomial distribution for n = 5 and various values of p 
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EXAMPLE 1 


For the symmetric case of equal chance of success and failure (p = q = 1/2) this gives 
the mean nil , the variance nl 4, and the probability function 


( 2 *) 



(x = 0, 1, • • • , n). 


Binomial Distribution 


Compute the probability of obtaining at least two “Six” in rolling a fair die 4 times. 


Solution . p = P(A) = P(“Six") = 1/6, q = 5/6, /i = 4. The event “At least two ‘Six’” occurs if we obtain 
2 or 3 or 4 “Six ” Hence the answer is 


P = /( 2) + /( 3) + m = 



1 

= -4 ( 6 • 25 + 4 • 5 4- |) = 


171 

1296 


= 13.2%. 


Poisson Distribution 

The discrete distribution with infinitely many possible values and probability function 

(5) f(x) = ^ e~» (x = 0, 1, • • •) 

is called the Poisson distribution, named after S. D. Poisson (Sec. 18.5). Figure 517 
shows (5) for some values of p. It can be proved that this distribution is obtained as a 
limiting case of the binomial distribution, if we let p — > 0 and n — » 00 so that the mean 
fx = np approaches a finite value. (For instance, p = np may be kept constant.) The 
Poisson distribution has the mean p and the variance (see Team Project 16) 

(6) a- 2 = ix. 

Figure 517 gives the impression that with increasing mean the spread of the distribution 
increases, thereby illustrating formula (6), and that the distribution becomes more and 
more (approximately) symmetric. 



Fig. 517. Probability function (5) of the Poisson distribution for various values of p 
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EXAMPLE 2 


EXAMPLE 3 


Poisson Distribution 

If the probability of producing a defective screw is p — 0.0 1, what is the probability that a lot of 100 screws 
will contain more than 2 defectives? 

Solution . The complementary event is A c : Not more than 2 defectives. For its probability we get from the 
binomial distribution with mean pi = ttp = 1 the value [see (2)] 

P(A C ) = 0.99 100 + 0.01 -0.99" + 0.01 2 -0.99 98 . 

Since p is very small, we can approximate this by the much more convenient Poisson distribution with mean 
p = np = 100-0.01 = I. obtaining [see (5)1 


P(A C ) = e~ l 




= 91.97%. 


Thus /W = 8.03%. Show that the binomial distribution gives P{A ) = 7.94%, so that the Poisson approximation 
is quite good. I 

Parking Problems. Poisson Distribution 

If on the average. 2 cars enter a certain parking lot per minute, what is the probability that during any given 
minute 4 or more cars will enter the lot? 

Solution. To understand that the Poisson distribution is a model of the situation, we imagine the minute to 
be divided into very many short time intervals, let p be the (constant) probability that a car will enter the lot 
during any such short interval, and assume independence of the events that happen during those intervals. Then 
we are dealing with a binomial distribution with very large n and very small p, which we can approximate by 
the Poisson distribution with 

p = np = 2, 

because 2 cars enter on the average. The complementary event of the event *‘4 cars or more during a given 
minute” is “3 cars or fewer enter the lor and has the probability 

/ 2 ° 

/<0) + /(!) + /( 2) + /( 3) = e~ 2 I — 

= 0.857. 

Answer: 14.3%. ( Why did we consider that complement?) 



Sampling with Replacement 

This means that we draw things from a given set one by one, and after each trial we 
replace the thing drawn (put it back to the given set and mix) before we draw the next 
thing. This guarantees independence of trials and leads to the binomial distribution. 
Indeed, if a box contains N things, for example, screws, M of which are defective, the 
probability of drawing a defective screw in a trial is p = M/N. Hence the probability of 
drawing a nondefective screw is q = 1 - p = 1 - M/N \ and (2) gives the probability of 
drawing * defectives in n trials in the form 


(7) 


~ ■ o ffl (’ - ?r 


(a* = 0, l, • • * , n). 
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EXAMPLE 4 


Sampling without Replacement. 

Hypergeometric Distribution 

Sampling without replacement means that we return no screw to the box. Then we no 
longer have independence of trials (why?), and instead of (7) the probability of drawing 
x defectives in n trials is 


( 8 ) 


/<*) = 


(3d 


(x = 0, 1, • • • , n). 


The distribution with this probability function is called the hypergeometric distribution 
(because its moment generating function (see Team Project 16) can be expressed by the 
hypergeometric function defined in Sec. 5.4, a fact that we shall not use). 

Derivation of (8). By (4a) in Sec. 24.4 there are 

(N' 


(a) ( ^ J different ways of picking n things from N, 


(b) 


(c) 


o* 


different ways of picking x defectives from M, 


\n-x/ 


different ways of picking n — x nondefectives from N - M, 


and each way in (b) combined with each way in (c) gives the total number of mutually 
exclusive ways of obtaining x defectives in n drawings without replacement. Since (a) is 
the total number of outcomes and we draw at random, each such way has the probability 


l/(n) * ^ rom ^s, (8) follows. 


The hypergeometric distribution has the mean (Team Project 16) 

M 


( 9 ) 

and the variance 

( 10 ) 


/x = n 


N 


a 2 = 


nM(N ~ M)(N - n) 
N 2 (N - 1) 


Sampling with and without Replacement 

We want to draw random samples of two gaskets from a box containing 10 gaskets, three of which are defective. 
Find the probability function of the random variable X = Number of defectives in the sample. 

Solution . We have N = 10, M — 3, N — M = 7, n = 2. For sampling with replacement. (7) yields 
/M = (*) (-^) r (f^) 2 *> m = 0.49, /(I) = 0.42, /(2) = 0.09. 

For sampling without replacement we have to use (8), finding 

m = (x) (2 - ,-)/('°) • 0.47. /( 2) = ^ - 0.07. ■ 
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If N, M, and N — M are large compared with n, then it does not matter too much whether 
we sample with or without replacement, and in this case the hyper geometric distribution 
may be approximated by the binomial distribution (with p = MIN), which is somewhat 
simpler. 

Hence in sampling from an indefinitely large population (“infinite population") we 
may use the binomial distribution , regardless of whether we sample with or without 
replacement. 


1. Four fair coins are tossed simultaneously. Find the 
probability function of the random variable X = Number 
of heads and compute the probabilities of obtaining no 
heads, precisely I head, at least 1 head, not more than 
3 heads. 

2. If the probability of hitting a target in a single shot is 
10% and 10 shots are fired independently, what is the 
probability that the target will be hit at least once? 

3. In Prob. 2, if the probability of hitting would be 5% 
and we fired 20 shots, would the probability of hitting 
at least once be less than, equal to. or greater than in 
Prob. 2? Guess first, then compute. 

4. Suppose that 3% of bolts made by a machine are 
defective, the defectives occurring at random during 
production. If the bolts are packaged 50 per box, 
what is the Poisson approximation of the probability 
that a given box will contain .v = 0, 1, • * • . 5 
defectives? 

5. Let X be the number of cars per minute passi ng a certain 
point of some road between 8 a.m. and 10 a.m. on a 
Sunday. Assume that X has a Poisson distribution with 
mean 5. Find the probability of observing 3 or fewer 
cars during any given minute. 

6. Suppose that a telephone switchboard of some 
company on the average handles 300 calls per hour, 
and that the board can make at most 10 connections 
per minute. Using the Poisson distribution, estimate the 
probability that the board will be overtaxed during a 
given minute. (Use Table A6 in App. 5 or your CAS.) 

7. (Rutherford-Geiger experiments) In 1910, E. 
Rutherford and H. Geiger showed experimentally that 
the number of alpha particles emitted per second in a 
radioactive process is a random variable X having a 
Poisson distribution. If X has mean 0.5, what is the 
probability of observing two or more particles during 
any given second? 

8. A process of manufacturing screws is checked every 
hour by inspecting n screws selected at random from 
that hour’s production. If one or more screws are 
defective, the process is halted and carefully examined. 
How large should n be if the manufacturer wants the 
probability to be about 95% that the process will be 


halted when 10% of the screws being produced are 
defective? (Assume independence of the quality of any 
screw of that of the other screws.) 

9. Suppose that in the production of 50-fl resistors, 
nondefective items are those that have a resistance 
between 45 H and 55 Cl and the probability of a 
resistor’s being defective is 0.2%. The resistors are sold 
in lots of 100, with the guarantee that all resistors are 
nondefective. What is the probability that a given lot 
will violate this guarantee? (Use the Poisson 
distribution.) 

10. Let p = 1% be the probability that a certain type of 
lightbulb will fail in a 24-hr test. Find the probability 
that a sign consisting of 10 such bulbs will bum 24 
hours with no bulb failures. 

11. Guess how much less the probability in Prob. 10 would 
be if the sign consisted of 100 bulbs. Then calculate. 

12. Suppose that a certain type of magnetic tape contains, 
on the average, 2 defects per 100 meters. What is the 
probability that a roll of tape 300 meters long will 
contain (a) a* defects, (b) no defects? 

13. Suppose that a test for extrasensory perception consists 
of naming (in any order) 3 cards randomly drawn from 
a deck of 13 cards. Find the probability that by chance 
alone, the person will correctly name (a) no cards, 
(b) 1 card, (c) 2 cards, (d) 3 cards. 

14. A carton contains 20 fuses, 5 of which are defective. 
Find the probability that, if a sample of 3 fuses is 
chosen from the carton by random drawing without 
replacement, ,v fuses in the sample will be defective. 

15. (Multinomial distribution) Suppose a trial can result in 
precisely one of k mutually exclusive events A 1? • • • , A k 
with probabilities p x , • • • , p k , respectively, where 
p 1 + • • • + p k = 1, Suppose that n independent trials 
are performed. Show that the probability of getting 
*i s, ■ • • , x k A k s is 

/(•Vl,---.A fc )= ■ " ! , Pl Xi --- Pk Xk 

Al> A^! 

where 0 ^ Xj ^ n, j = 1, • • • , and 
a*! + • • • + x k = n. The distribution having this 
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probability function is called the multinomial 
distribution. 

16. TEAM PROJECT. Moment Generating Function. 
The moment generating function G(t) is defined by 

GO) = E(e tX ’) = 2 e * J /C*> 
j 

or 

G(t) = E(e lX ) = f e tx f(x) dx 

— OC 

where X is a discrete or continuous random variable, 
respectively. 

(a) Assuming that termwise differentiation and 
differentiation under the integral sign are permissible, 
show that E(X k ) = C <te) (0), where G (fc) = d k G/di k , in 
particular, p = G r (0). 


(b) Show that the binomial distribution has the 
moment generating function 

GO) = 2 J* (") p v* - 2 (") (pW* 

.r-0 VV ,r=0 W 

= (pe* + q) n . 

(c) Using (b), prove (3). 

(d) Prove (4). 

(e) Show that the Poisson distribution has the moment 

generating function G(t) = and prove (6). 

• co— (-=:)■ 

Using this, prove (9). 


24.8 Normal Distribution 

Turning from discrete to continuous distributions, in this section we discuss the normal 
distribution. This is the most important continuous distribution because in applications 
many random variables are normal random variables (that is, they have a normal 
distribution) or they are approximately normal or can be transformed into normal random 
variables in a relatively simple fashion. Furthermore, the normal distribution is a useful 
approximation of more complicated distributions, and it also occurs in the proofs of various 
statistical tests. 

The normal distribution or Gauss distribution is defined as the distribution with the 
density 


( 1 ) 





2 



(cr> 0) 


where exp is the exponential function with base e = 2.718 • • • . This is simpler than it 
may at first look. f(x) has these features (see also Fig. 518). 

1. /x is the mean and cr the standard deviation. 

2. l/(crV27r) is a constant factor that makes the area under the curve of f(x) from — 
to oc equal to 1, as it must be by (10), Sec. 24.5. 

3. The curve of /( x) is symmetric with respect to x = fi because the exponent is 
quadratic. Hence for jjl = 0 it is symmetric with respect to the y-axis x = 0 
(Fig. 518, “bell-shaped cur\>es”). 

4. The exponential function in (1) goes to zero very fast— the faster the smaller the 
standard deviation cris, as it should be (Fig. 518). 
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Fig. 518. Density (1) of the normal distribution with /jl = 0 for various values of a 


Distribution Function F[x) 

From (7) in Sec. 24.5 and ( l ) we see that the normal distribution has the distribution 
function 


( 2 ) 


FM = 


<tV27 t 


L exp [ _ 1 (V 1 ) 2 ] dv - 


Here we needed x as the upper limit of integration and wrote v (instead of x) in the integrand. 

For the corresponding standardized normal distribution with mean 0 and standard 
deviation l we denote F(a‘) by <P(z). Then we simply have from (2) 

(3) <fr(s) = -4= / <r“ 2/2 du. 

v 27 T •'-oc 


This integral cannot be integrated by one of the methods of calculus. But this is no serious 
handicap because its values can be obtained from Table A7 in App. 5 or from your CAS. 
These values are needed in working with the normal distribution. The curve of 3>(z) is 
S-shaped. It increases monotone (why?) from 0 to 1 and intersects the vertical axis at 
1/2 (why?), as shown in Fig. 519. 

Relation Between F(x) and <£(z). Although your CAS will give you values of F(x) in 
(2) with any /x and a directly, it is important to comprehend that and why any such an 
F(x) can be expressed in terms of the tabulated standard <&(z), as follows. 



Fig. 519. Distribution function 0>(z) of the normal distribution with mean 0 and variance 1 
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THEOREM 1 


PROOF 


THEOREM 2 


PROOF 


Use of the Normal Table A7 in App. 5 

The distribution function F(x) of the normal distribution with any p and a [see (2)] 
is related to the standardized distribution function 3>(z) in (3) by the formula 

r ( x — LL 

(4) F{x) = <*> 


Comparing (2) and (3) we see that we should set 

V — fJL X — fl 

u = . Then v = x gives u = 

a or 

as the new upper limit of integration. Also v — p = crw, thus dv = adu. Together, since 
cr drops out. 

Probabilities corresponding to intervals will be needed quite frequently in statistics in 
Chap. 25. These are obtained as follows. 


Normal Probabilities for Intervals 

The probability that a normal random variable X with mean p and standard 
deviation cr assume any value in an interval a < x ^ b is 


(5) 


P(a<X^b) = Fib) - Fid) = $ | ^ j - «I> . 


Formula (2) in Sec. 24.5 gives the first equality in (5), and (4) in this section gives the 
second equality. ■ 

Numeric Values 

In practical work with the normal distribution it is good to remember that about 2/3 of all 
values of X to be observed will lie between p ± cr, about 95% between p ± 2 cr, and practically 
all between the three-sigma limits p ± 3cr. More precisely, by Table A7 in App. 5, 

(a) P(p — cr<X^/x+cr)~ 68% 

(6) (b) P(p - 2a < X S p + 2o) ~ 95.5% 

(c) P(p - 3a < X ^ p + 3cr) ~ 99.7%. 

Formulas (6a) and (6b) are illustrated in Fig. 520. 

The formulas in (6) show that a value deviating from p by more than tr, 2cr, or 3cr will 
occur in one of about 3, 20, and 300 trials, respectively. 
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EXAMPLE 1 


EXAMPLE 2 



Fig. 520. Illustration of formula (6) 


In tests (Chap. 25) we shall ask conversely for the intervals that correspond to certain 
given probabilities; practically most important are the probabilities of 95%, 99%, and 
99.9%. For these, Table A8 in App. 5 gives the answers jx ± 2a , jx ± 2.5 a, and 
fx ± 3.3 <7, respectively. More precisely. 


(7) 


(a) P(fx - 1.96cr < X ^ fx + 1 .96cr) = 95% 

(b) P(fx ~ 2.58(7 < X ^ fx -f 2.58 a) = 99% 

(c) P(fx - 3.29(7 < X ^ /X + 3.29 a) = 99.9%. 


Working With the Normal Tables A7 and A8 in App. 5 

There are two normal tables in App. 5, Tables A7 and A8. If you want probabilities, use 
Table A7. If probabilities are given and corresponding intervals or x-values are wanted, 
use Table A8. The following examples are typical. Do them with care, verifying all values, 
and don’t just regard them as dull exercises for your software. Make sketches of the density 
to see whether the results look reasonable. 

Reading Entries from Table A7 

If X is standardized normal (so that y. = 0, a = 1), then 

P(X g 2.44) = 0.9927 *= 99*% 

P(X S -1.16) = 1 - $(1.16) = I - 0.8770 = 0.1230 = 12.3% 

P(X a 1) = 1 — P(X =g 1) = 1 - 0.8413 = 0.1587 by (7), Sec. 24.3 

P(I.O s X s 1.8) = (1.8) - <t>(1.0) = 0.9641 - 0.8413 = 0.1228. ■ 

Probabilities for Given Intervals, Table A7 

Let X be normal with mean 0.8 and variance 4 (so that <r = 2). Then by (4) and (5) 

/ 2.44 - 0.80 \ 

P(X S 2.44) = F(2M) = 4>l 1 = d>(0.82) = 0.7939 ~ 80% 

or if you like it better (similarly in the other cases) 

IX- 0.80 2.44 - 0.80 \ 

P(X s 2.44) = Pi s J = P(ZS 0.82) = 0.7939 

P(X S I) ■ I - P(X Sl)=l- $(— 2 °' 8 j = I - 0.5398 = 0.4602 


Pd-O SXi 1.8) = $(0.5) - $(0.1) = 0.6915 - 0.5398 = 0.1517. 
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EXAMPLE 3 Unknown Values c for Given Probabilities, Table A8 

Let X be normal with mean 5 and variance 0.04 (hence standard deviation 0.2). Find c or k corresponding to 
the given probability 

P(X Sf) = 95%. = 95%. = 1.645. c = 5.329 

f><5 - k £ X £ 5 + k) = 90%, 5 + k = 5.329 (as before; why?) 

c — 5 

P(X ^ c) = 1%. thus P(X ^ c) = 99%. -jpj- = 2.326, c = 5.465. ■ 


EXAMPLE 4 Defectives 


in a production of iron rods let the diameter X be normally distributed with mean 2 in. and standard deviation 
0.008 in. 

(a) What percentage of defectives can we expect if we set the tolerance limits at 2 ± 0.02 in.? 

(b) How should we set the tolerance limits to allow for 4% defectives? 

Solution . (a) 1 5 % because from (5) and Table A7 we obtain for the complementary event the probability 


/>(1.98^X=2 2.02) 


/ 2.02 - 2.00 \ / 1.98 - 2.00 \ 

\ 0.008 ) t 0.008 / 

$(2.5) - $(-2.5) 

0.9938 - (1 - 0.9938) 

0.9876 


= 98|%. 


(b) 2 ± 0.0164 because for (he complementary event we have 


0.96 = P(2 - c S X S 2 + c) 


or 

0.98 = P(X ^ 2 + c) 


so that Table A 8 gives 


(2 + c - 2 \ 
0.98 = <S> - ■ B , 

\ 0.008 / 


2 + c ~ 2 
0.008 


= 2.054. 


c = 0.0164. ■ 


Normal Approximation of the Binomial Distribution 

The probability function of the binomial distribution is (Sec. 24.7) 


( 8 ) 



(x = 0, 1, • • • , n). 


If n is large, the binomial coefficients and powers become very inconvenient. It is of great 
practical (and theoretical) importance that in this case the normal distribution provides a 
good approximation of the binomial distribution, according to the following theorem, one 
of the most important theorems in all probability theory. 
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THEOREM 3 


Limit Theorem of De Moivre and Laplace 

For large n , 

(9) fix) ~ fix) ix - 0, 1, • • • , n). 


Here f is given by (8). The function 


( 10 ) 



is the density of the normal distribution with mean p = np and variance cr 2 = npq 
(the mean and variance of the binomial distribution). The symbol ~~ (read 
asymptotically equal) means that the ratio of both sides approaches 1 as n 
approaches *>. Furthermore , for any nonnegative integers a and b (> a), 

P(a^Xtkb) = ^ p x q n ~ x - <J>(j8) - <D(a), 

(II) x==a ' 

a — np — 0.5 ~ b — np -1- 0.5 

a = "7= , p= "7 • 

Vnptf v npq 


A proof of this theorem can be found in [G3] listed in App. 1. The proof shows that the 
term 0.5 in a and /3 is a correction caused by the change from a discrete to a continuous 
distribution. 


PROBLEM SET 24.8 


1 1-13 1 NORMAL DISTRIBUTION 

1. Let X be normal with mean 80 and variance 9. 

Find P(X > 83), P(X < 81), P(X < 80), and 
P( 78 < X < 82). 

2. Let X be normal with mean 120 and variance 16. Find 
P(X ^ 126), P(X > 1 16), F(I25 < X < 130). 

3. Let X be normal with mean 1 4 and variance 4. Determine 
c such that P(X 3 c) = 95%. P(X ^ c) = 5%, 
P(X ^ c) = 99.5%. 

4. Let X be normal with mean 4.2 and variance 0.04. 
Find c such that P(X ^ c) = 50%, P(X > c) = 10%, 
P(-c <X-4.2£c) = 99%. 

5. If the lifetime X of a certain kind of automobile 
battery is normally distributed with a mean of 4 yr 
and a standard deviation of 1 yr, and the manufacturer 
wishes to guarantee the battery for 3 yr, what 
percentage of the batteries will he have to replace 


under the guarantee? 

6. If the standard deviation in Prob. 5 were smaller, 
would that percentage be smaller or larger? 

7. A manufacturer knows from experience that the 
resistance of resistors he produces is normal with mean 
p, = 150 H and standard deviation o = 5 Cl. What 
percentage of the resistors will have resistance between 
148 H and 152 H? Between 140 O and 160 fl? 

8. The breaking strength X fkgl of a certain type of 
plastic block is normally distributed with a mean of 
1250 kg and a standard deviation of 55 kg. What is 
the maximum load such that we can expect no more 
than 5% of the blocks to break? 

9. A manufacturer produces airmail envelopes whose 
weight is normal with mean p = 1.950 grams and 
standard deviation a = 0.025 grams. The envelopes 
are sold in lots of 1000. How many envelopes in a lot 
will be heavier than 2 grams? 
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10. If the resistance X of certain wires in an electrical 
network is normal with mean 0.01 ft and standard 
deviation 0.001 ft, how many of 1000 wires will meet 
the specification that they have resistance between 

0.009 and 0.011 ft? 

11. If the mathematics scores of the SAT college entrance 
exams are normal with mean 480 and standard 
deviation 100 (these are about the actual values over 
the past years) and if some college sets 500 as the 
minimum score for new students, what percent of 
students will not reach that score? 

12. If the monthly machine repair and maintenance cost X 
in a certain factory is known to be normal with mean 
$12000 and standard deviation $2000. what is the 
probability that the repair cost for the next month will 
exceed the budgeted amount of $15000? 

13. If sick-leave time X used by employees of a company 
in one month is (very roughly) normal with mean 1000 
hours and standard deviation 100 hours, how much 
time t should be budgeted for sick leave during the next 
month if r is to be exceeded with probability of only 
20 %? 

14. TEAM PROJECT. Normal Distribution, (a) Derive 
the formulas in (6) and (7) from the appropriate normal 
table. 

(b) Show that $(-<:) = I - Give an example. 

(c) Find the points of inflection of the curve of (1). 


(d) Considering <I> 2 (*>) and introducing polar 
coordinates in the double integral (a standard trick 
worth remembering), prove 

i 

(12) <!>(«)=—= e~ u2,z du = I. 

V 277 -'—oo 

(e) Show that <r in ( 1 ) is indeed the standard deviation 
of the normal distribution. [Use (12).] 

(f) Bernoulli’s law of large numbers. In an experiment 
let an event A have probability p (0 < p < 1 ), and let 
X be the number of times A happens in n independent 
trials. Show that for any given e > 0, 

X I \ 

/; I ^ € I — » 1 as n sc. 

« I / 

(g) Transformation. If X is normal with mean p, 
and variance cr 2 , show that X* = c x X 4- c 2 (c x > 0) 
is normal with mean p* = c\p, + c 2 and variance 
o-* 2 = CjV 2 , 

15. WRITING PROJECT. Use of Tables. Give a 

systematic discussion of the use of Tables A7 and A8 

for obtaining P(X < b ), P{X > a)> P(a < X < b), 
P(X < c) = L P(X > c) = k , as well as 
P(p, - c < X < p, + c) = k; include simple examples. 
If you have a CAS, describe to what extent it makes 
the use of those tables superfluous; give examples. 


24.9 Distributions of Several Random Variables 

Distributions of two or more random variables are of interest for two reasons: 

1. They occur in experiments in which we observe several random variables, for 
example, carbon content X and hardness Y of steel, amount of fertilizer X and yield of 
corn F, height X lf weight X 2 , and blood pressure X 3 of persons, and so on. 

2. They will be needed in the mathematical justification of the methods of statistics in 
Chap. 25. 

In this section we consider two random variables X and Y or, as we also say, a two- 
dimensional random variable (X, Y). For (X, Y) the outcome of a trial is a pair of numbers 
X = a\ Y = y, briefly (X, Y) = (a*, y), which we can plot as a point in the XY-plane. 

The two-dimensional probability distribution of the random variable (X, Y) is given 
by the distribution function 

(1) F(x, y) = P(X = a*, Y ^ y). 

This is the probability that in a trial, X will assume any value not greater than x and in 
the same trial, Y will assume any value not greater than y. This corresponds to the blue 
region in Fig. 521, which extends to to the left and below. F(x, y ) determines the 
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EXAMPLE 1 


Y 

(x,y) 






X 


Fig. 521. Formula (1) 


probability distribution uniquely, because in analogy to formula (2) in Sec. 24.5, that is, 
P(a < X ^ b) = F(b) — F(a ), we now have for a rectangle (see Prob. 14) 

(2) P(a x <X^b l9 a 2 < Y ^ b 2 ) = F(b l9 b 2 ) - F(a l9 b 2 ) - F(b l9 a 2 ) + F(a l9 a 2 ). 

As before, in the two-dimensional case we shall also have discrete and continuous 
random variables and distributions. 


Discrete Two-Dimensional Distributions 

In analogy to the case of a single random variable (Sec. 24.5), we call ( X , y) and its 
distribution discrete if (X, Y) can assume only finitely many or at most countably infinitely 
many pairs of values (x l9 y x ) 9 (x 2 , y 2 ), • • • with positive probabilities, whereas the 
probability for any domain containing none of those values of (X, Y) is zero. 

Let {x h yj) be any of those pairs and let P{X = x it Y = yj) = p ^ (where we admit that 
Pij may be 0 for certain pairs of subscripts ij). Then we define the probability function 
/(** y) of (X, Y) by 

(3) /(a*, y) = Pjj if a = x h y = and /(a, v) = 0 otherwise; 

here, / = 1, 2, • • • and j = 1, 2, • • • independently. In analogy to (4), Sec. 24.5, we now 
have for the distribution function the formula 

(4) F(x, y) = S E fOCi, )$. 

Xi^x Vj^y 


Instead of (6) in Sec. 24.5 we now have the condition 


(5) 2 2 f(*i> yj) = l- 

i j 

Two-Dimensional Discrete Distribution 

If we simultaneously loss a dime and a nickel and consider 

X = Number of heads the dime turns up, 
Y = Number of heads the nickel turns up, 
then X and Y can have the values 0 or J . and the probability function is 


/(0. 0) — /(I, 0) — /( 0, I) — /(l, 1) = f(x , y) = 0 otherwise. 



1034 


CHAP. 24 Data Analysis. Probability Theory 


EXAMPLE 2 


Y 

*2- 


I I I 

a l *1 * 

Fig. 522. Notion of a two-dimensional distribution 

Continuous Two-Dimensional Distributions 

In analogy to the case of a single random variable (Sec. 24.5) we call (X, Y) and its 
distribution continuous if the corresponding distribution function F(a\ y) can be given by 
a double integral 


( 6 ) 


F(x, .v) = / 


y 

-oc 


j /(**, y*) dx* dy* 


whose integrand /, called the density of ( X , Y ), is nonnegative everywhere, and is 
continuous, possibly except on finitely many curves. 

From (6) we obtain the probability that (X, Y) assume any value in a rectangle 
(Fig. 522) given by the formula 

*>2 I 

J f(x,y)dxdy. 

12 at 


( 7 ) 


Two-Dimensional Uniform Distribution in a Rectangle 

Let R be the rectangle a x < x ^ /5 lf ct 2 < v ^ /3 2 . The density (see Fig. 523) 

(8) /(.v, y) = Uk if (x, y) is in R , /(.v, y) = 0 otherwise 

defines the so-called uniform distribution in the rectangle R; here k = (p l — oriXfe “ a 2) ls the area of R. 
The distribution function is shown in Fig. 524. I 




Fig. 523. Density function (8) of the Fig. 524. Distribution function of the 

uniform distribution uniform distribution defined by (8) 


Marginal Distributions of a Discrete Distribution 

This is a rather natural idea, without counterpart for a single random variable. It amounts 
to being interested only in one of the two variables in (X, Y), say, X, and asking for its 
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EXAMPLE 3 


distribution, called the marginal distribution of X in (X, Y). So we ask for the probability 
P(X = .v, y arbitrary). Since (X, Y) is discrete, so is X. We get its probability function, 
call it fi{x) 9 from the probability function fix , y) of (X, Y) by summing over y: 

(9) AW = P(X = x,Y arbitrary) = X f(x, y) 

y 

where we sum all the values of /(a*, y) that are not 0 for that a*. 

From (9) we see that the distribution function of the marginal distribution of X is 

(10) F 1 (x) = P(X § A-, Y arbitrary) = 2 AC**). 

X*^X 

Similarly, the probability function 

(11) / 2 (>') = P(X arbitrary, Y = y) = 2 /(*. .v) 


determines the marginal distribution of Y in (X, Y). Here we sum all the values of 
/(a*, y) that are not zero for the corresponding y. The distribution function of this marginal 
distribution is 

(12) F 2 (y) = P(X arbitrary, Y ^ y) = 2 A O'*)- 

y*^y 


Marginal Distributions of a Discrete Two-Dimensional Random Variable 

In drawing 3 cards with replacement from a bridge deck let us consider 

(X. Y). X = Number of queens. Y = Number of kings or aces. 


The deck has 52 cards. These include 4 queens. 4 kings, and 4 aces. Hence in a single trial a queen has probability 
4/52 = 1/13 and a king or ace 8/52 = 2/13. This gives the probability function of (X, Y), 


/(.v. y) = 


A’! y! (3 


3! /j_\* /jzy / _m 

-A -v)! \ 13 / \I3J l 13/ 


(x + y ^ 3 ) 


and f(x, y) — 0 otherwise. Table 24.1 shows in the center the values of f(\\ y) and on the right and lower margins 
the values of the probability functions fi(x) and / 2 (y) of the marginal distributions of X and T, respectively. M 


Table 24.1 Values of the Probability Functions /(x, y), /,(x), / 2 (y) in Drawing 
Three Cards with Replacement from a Bridge Deck, where X is the Number 
of Queens Drawn and Y is the Number of Kings or Aces Drawn 


X 

X 

0 

l 

2 

3 

AW 

0 

1000 

2197 

600 

2197 

120 

2197 

2l97 

1728 

2197 

1 

300 

2197 

120 

2197 

12 

2197 

0 

432 

2197 

2 

2197 

2^7 

0 

0 

Ksh 

3 

2lW 

0 

0 

0 

1 

2197 

j /200 

1331 

2197 

726 

2197 

2l97 

_ 8 _ 

2197 
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EXAMPLE 


Marginal Distributions of a Continuous Distribution 

This is conceptually the same as for discrete distributions, with probability functions and 
sums replaced by densities and integrals. For a continuous random variable (X, Y) with 
density f(x ; y) we now have the marginal distribution of X in (X, 7), defined by the 
distribution function 

(13) F x (x) = P(X = A‘, co < Y < cc) = f f\(x*) clx* 

— 00 

with the density f x of X obtained from f(x, y) by integration over y. 


(14) 


fi(x) = f f(x, y) dy. 

— CO 


Interchanging the roles of X and Y , we obtain the marginal distribution of Y in (X, Y) 
with the distribution function 


(15) 


F 2 (y) = P(- oo < X < oo, y £ y) = f f 2 (y*) df 


and density 


(16) f 2 (y) = f f(*,y)dx. 

J -cc 


Independence of Random Variables 

X and Y in a (discrete or continuous) random variable (X, Y) are said to be independent 
if 

(17) F(*, y) = F x (x)F 2 (y) 

holds for all ( x , y). Otherwise these random variables are said to be dependent. These 
definitions are suggested by the corresponding definitions for events in Sec. 24.3. 
Necessary and sufficient for independence is 

(18) /(*. >0 = fi(x)f 2 (y) 

for all .v and y. Here the fs are the above probability functions if (X, Y) is discrete or 
those densities if (X, Y) is continuous. (See Prob. 20.) 

Independence and Dependence 

In tossing a dime and a nickel, X = Number of heads on the dime, Y = Number of heads on the nickel may 
assume the values 0 or 1 and are independent. The random variables in Table 24.1 are dependent. ■ 


Extension of Independence to /{-Dimensional Random Variables. This will be needed 
throughout Chap. 25. The distribution of such a random variable X = (X lt • • • , X n ) is 
determined by a distribution function of the fo rm 
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F(xi, ••*>•*„) = P(X i ^ Ai, • • • , X n S ,v n ). 


The random variables X lt • • • , X n are said to be independent if 

(19) F(x x , ■ ,x„) = F x (x x )F 2 (x 2 ) • • • F n ( x n ) 

for all (a'j, • • • , x n ). Here Fj(Xj) is the distribution function of the marginal distribution 
of Xj in X, that is. 


Fj(xj) = P(Xj g xj, X k arbitrary, k = 1, • • • , n, k # j). 


Otherwise these random variables are said to be dependent. 


Functions of Random Variables 

When n = 2, we write X x = X, X 2 = Y , x 1 = x , x 2 = y. Taking a nonconstant continuous 
function g(x, y ) defined for all x , y, we obtain a random variable Z = g(X, Y). For example, 
if we roll two dice and X and Y are the numbers the dice turn up in a trial, then 
Z = X + Y is the sum of those two numbers (see Fig. 513 in Sec. 24.5). 

In the case of a discrete random variable ( X , Y ) we may obtain the probability function 
f(z ) of Z = g(X , Y) by summing all /(a, y) for which #(a, y) equals the value of z 
considered; thus 

(20) f(z) = P(Z = z) = 22 fix, y). 

g(x,y)=z 


Hence the distribution function of Z is 


(21) F(z) = P(Z g;) = 22 fix, y) 

g(x,j/)^z 

where we sum all values of f(x ; y) for which #(a, y) ^ z. 

In the case of a continuous random variable ( X , Y) we similarly have 

(22) F(z) = P(Z S z) = JJ f(x, y) dx dy 

g(x,y)^z 

where for each z we integrate the density /(a, y) of (X, T) over the region g(x, y) ^ z in 
the Ay-plane, the boundary curve of this region being g( a, y) = z. 


Addition of Means 

The number 


(23) 


E{g{X, Y)) = 


2 2 gix , y)fix, y) 

x y 

/ I Six, y)f(x, y) dx dy 

— 3C —SC 


[(X, Y) discrete] 
[(X, Y) continuous] 



1038 


CHAP. 24 Data Analysis. Probability Theory 


is called the mathematical expectation or, briefly, the expectation of g(X , Y). Here it is 
assumed that the double series converges absolutely and the integral of |g(A*, y)\f{x. y) over 
the .xy-plane exists (is finite). Since summation and integration are linear processes, we 
have from (23) 

(24) E{ag{X, Y) + bh{X, Y)) = aE(g(X, Y)) + bE(h(X, Y)). 

An important special case is 


E(X -f- Y) = E(X) + E(Y ), 
and by induction we have the following result. 


THEOREM 


Addition of Means 

The mean ( expectation ) of a sum of random variables equals the sum of the means 
(i expectations ), that is, 

(25) E(X x 4- X 2 + • • • + X n ) = E(X x ) 4- E(X 2 ) 4- ■ • • + E(X n ). 


Furthermore, we readily obtain 


THEROEM 2 


Multiplication of Means 

The mean ( expectation ) of the product of independent random variables equals the 
product of the means ( expectations ), that is, 

(26) E{X y X 2 • ■ • X^ = E(X x )E(X 2 ) * • • E(X„). 


PROOF U X and Y are independent random variables (both discrete or both continuous), then 
E(XY) = E(X)E(Y). In fact, in the discrete case we have 

E(XY) = 22 xyf(x, y) = 2 JtfiW 2 .v/ 2 (v) = E(X)E(Y), 

x y x y 

and in the continuous case the proof of the relation is similar. Extension to n independent 
random variables gives (26), and Theorem 2 is proved. ■ 

Addition of Variances 

This is another matter of practical importance that we shall need. As before, let Z = X 4- Y 
and denote the mean and variance of Z by p and a 2 . Then we first have (see Team Project 
16(a) in Problem Set 24.6) 


<r 2 = E([Z - rf) = E(Z 2 ) - [E(Z)f. 
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From (24) we see that the first term on the right equals 

£(Z 2 ) = E{X 2 + 2XY + Y 2 ) = E(X 2 ) 4- 2E(XY) + E(Y 2 ). 

For the second term on tlie right we obtain from Theorem 1 

[E(Z)] 2 = [E(X) 4- E(Y)] 2 = [E(X)] 2 + 2 E(X)E(Y) + [E(Y)] 2 . 

By substituting these expressions into the formula for a 2 we have 

<t 2 = E(X 2 ) - [E(X)f 4- E(Y 2 ) - [E(Y)f 
+ 2[E(XY) - E(X)E(Y)l 

From Team Project 16, Sec. 24.6, we see that the expression in the first line on the right 
is the sum of the variances of X and Y , which we denote by a 2 and or 2 , respectively. 
The quantity in the second line (except for the factor 2) is 

(27) cr xy = E(XY) - E(X)E(Y) 


and is called the covariance of X and Y. Consequently, our result is 
(28) a 2 = Oi 2 + cr 2 -f 2 cr XY . 

If X and Y are independent, then 


E(XY) = E(X)E(J)\ 

hence Oxy = 0, and 

(29) cr 2 = a 2 4- or 2 2 . 

Extension to more than two variables gives the basic 


THEOREM 3 


Addition of Variances 

The variance of the sum of independent random variables equals the sum of the 
variances of these variables . 


CAUTION! In the numerous applications of Theorems 1 and 3 we must always 
remember that Theorem 3 holds only for independent variables. 

This is the end of Chap. 24 on probability theory. Most of the concepts, methods, and 
special distributions discussed in this chapter will play a fundamental role in the next 
chapter, which deals with mediods of statistical inference, that is, conclusions from 
samples to populations, whose unknown properties we want to know and try to discover 
by looking at suitable properties of samples that we have obtained. 
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gaivial 


1. Lei /(.v, v) = k when 8 ^ a ^ 12 and 0 ^ y ^ 2 and 
zero elsewhere. Find k. Find P{X ^ 1 1, 1 ^ F ^ 1.5) 
and P(9 ^ X ^ 13, F^i 1). 

2. Find P(X > 2, F > 2) and P(X ^ 1, F ^ 1) if (X. F) 
has the density /(.v, v) = 1/8 if a ^ 0, y ^ 0, .v 4- y = 4. 

3. Let /(.v, y) = k if a* > 0, y > 0, a *1- y < 3 and 0 
otherwise. Find k. Sketch /(a. y). Find P(X + F ^ 1 ), 
P(F > X). 

4. Find the density of the marginal distribution of X in 
Prob 2. 

5. Find the density of the marginal distribution of F in 
Fig. 523. 

6. If certain sheets of wrapping paper have a mean weight 
of 10 g each, with a standard deviation of 0.05 g, what 
are the mean weight and standard deviation of a pack 
of 10 000 sheets? 

7. What are the mean thickness and the standard deviation 
of transformer cores each consisting of 50 layers of 
sheet metal and 49 insulating paper layers if the metal 
sheets have mean thickness 0.5 mm each with a 
standard deviation of 0.05 mm and the paper layers 
have mean 0.05 mm each with a standard deviation of 
0.02 mm? 

8. If the weight of certain (empty) containers has mean 
2 lb and standard deviation 0. 1 lb, and if the filling of 
the containers has mean weight 75 lb and standard 
deviation 0.8 lb, what are the mean weight and standard 
deviation of filled containers? 

9. A 5-gear assembly is put together with spacers between 
the gears. The mean thickness of the gears is 5.020 cm 
with a standard deviation of 0.003 cm. The mean 
thickness of the spacers is 0.040 cm with a standard 
deviation of 0.002 cm. Find the mean and standard 
deviation of the assembled units consisting of 5 randomly 
selected gears and 4 randomly selected spacers. 

10. Give an example of two different discrete distributions 
that have the same marginal distributions. 

11. Show dial the random variables with the densities 


and 


/(a, y) = a + y 
g( a, y) = (a 4- |)(y 4- 1) 


12. Let X [cm] and F [cm] be the diameter of a pin and 
hole, respectively. Suppose that (X. F) has the 
density 

/(a, y) = 2500 if 
0.99 <.v< 1.01, 1.00 <y< 1.02 

and 0 otherwise, (a) Find the marginal distributions, 
(b) What is the probability that a pin chosen at random 
will fit a hole whose diameter is 1 .00? 

13. An electronic device consists of two components. Let 
X and F [months] be the length of time until failure of 
the first and second component, respectively. Assume 
that (X, F) has the probability density 

/(a, y) = 0.01«" ,ailaf+ » > if a > 0 and y > 0 

and 0 otherwise, (a) Are X and F dependent or 
independent? (b) Find the densities of the marginal 
distributions, (c) What is the probability that the first 
component has a lifetime of 10 months or longer? 

14. Prove (2). 

15. Find P(X > F) when (X, F) has the density 

/(.v. v) = 0.25e-°- 5( * +1,> if a g 0, y S 0 

and 0 otherwise. 

16. Let (X, F) have the density 

/(a, v) = k if a 2 4- y 2 < 1 


and 0 otherwise. Determine k. Find the densities of 
the marginal distributions. Find the probability 

P(X 2 + F 2 < 1/4). 

17. Let (X, F) have the probability function 

/( 0 , 0 ) = /( 1 , 1 ) = 1 / 8 , 

m 1) = /( L 0) = 3/8. 

Are X and F independent? 

18. Using Theorem I, obtain the formula for the mean of 
the hypergeometric distribution. Can you use Theorem 
3 to obtain the variance of that distribution? 


if 0 = x = 1.0^ y ^ 1 and /(a. y) = 0 and 
£(a, y) = 0 elsewhere, have the same marginal 
distribution. 


19. Using Theorems 1 and 3, obtain the formulas for the 
mean and the variance of the binomial distribution. 

20. Prove the statement involving (18). 
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STIONS AND PROBLEMS 


1. Why did we begin the chapter with a section on handling 
data? 

2. What are stem-and-leaf plots? Boxplots? Histograms? 
Compare their advantages. 

3. What quantities measure the average size of data? The 
spread? 

4. Why did we consider probability theory? What is its 
role in statistics? 

5. What do we mean by an experiment? By a random 
variable related with it? What are outcomes? Events? 

6. Give examples of experiments in which you have 
equally likely cases and others in which you don’t. 

7. State the definition of probability from memory. 

8. What is the difference between the concepts of a 
permutation and a combination? 

9. State the main theorems on probability. Illustrate them 
by simple examples. 

10. What is the distribution of a random variable? The 
distribution function? The probability function? The 
density? 

11. State the definitions of mean and variance of a random 
variable from memory. 

12. If P(A) = P(B) and A Q B, can A =£ B1 

13. If E ± S (= the sample space), can P(E) = 1? 

14. What distributions correspond to sampling with 
replacement and without replacement? 

15. When will an experiment involve a binomial 
distribution? A hypergeometric distribution? 

16. When will the Poisson distribution be a good 
approximation of the binomial distribution? 

17. What do you know about the approximation of the 
binomial distribution by the normal distribution? 

18. Explain the use of the tables of the normal distribution. 
If you have a CAS, how would you proceed without the 
tables? 

19. Can the probability function of a discrete random 
variable have infinitely many positive values? 

20. State the most important facts about distributions of two 
random variables and their marginal distributions. 

21. Make a stem-and-leaf plot, histogram, and boxplot of 
the data 22.5, 23.2. 22.1, 23.6, 23.3, 23.4, 24.0, 20.6, 
23.3. 

22. Do the same task as in Prob. 21, for the data 210, 213, 
209, 218, 210, 215, 204, 21 1, 216, 213. 


23. Find the mean, standard deviation, and variance in 
Prob. 21. 

24. Find the mean, standard deviation, and variance in 
Prob. 22. 

25. What are the outcomes of the sample space of 
X: Tossing a coin until the first Head appears ? 

26. What are the outcomes in the sample space of the 
experiment of simultaneously tossing three coins? 

27. A box contains 50 screws, five of which are defective. 
Find the probability function of the random variable 
X = Number of defective screws in drawing two screws 
without replacement and compute its values. 

28. Find the values of the distribution function in Prob. 27. 

29. Using a Venn diagram, show that A C B if and only if 
A \J B = B, 

30. Using a Venn diagram, show that A C B if and only if 
A D B = A. 

31. If X has the density /( *) = 0.5* (0 * ^ 2) and 

0 otherwise, what are the mean and the variance of 
X* = -2X 4- 5? 

32. If 6 different inks are available, in how many ways can 
we select two colors for a printing job? Four colors? 

33. Compute 5 ! by the Stirling formula and find the absolute 
and relative errors. 

34. Two screws are randomly drawn without replacement 
from a box containing 7 right-handed and 3 left- 
handed screws. Let X be the number of left-handed 
screws drawn. Find P(X = 0), P(X = 1), P(X = 2), 
P( 1 < X < 2), P(0 < X < 5). 

35. Find the mean and the variance of the distribution 
having the density /(*) = 

36. Find the skewness of the distribution with density 
/(*) = 2(1 — x) if 0 < * < 1, /( jc) = 0 otherwise. 

37. Sketch the probability function /(x) = x 2 /30 

(x = 1, 2, 3, 4) and the distribution function. Find p.. 

38. Sketch F(x) = 0 if* ^ 0, F(x) = 0.2* if 0 < * ^ 5, 
F(x) = 1 if* > 5, and its density /(*). 

39. If the life of tires is normal with mean 25 000 km and 
variance 25 000 000 km 2 , what is the probability that a 
given one of those tires will last at least 30 000 km? At 
least 35 000 km? 

40. If the weight of bags of cement is normal with mean 
50 kg and standard deviation 1 kg, what is the 
probability that 100 bags will be heavier than 5030 kg? 
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SUMM ARY OF CHATTrER: 2l4 

Data Analysis. Probability Theory 


A random experiment , briefly called experiment, is a process in which the result 
(“outcome”) depends on ‘‘chance” (effects of factors unknown to us). Examples are 
games of chance with dice or cards, measuring the hardness of steel, observing 
weather conditions, or recording the number of accidents in a city. (Thus the word 
“experiment” is used here in a much wider sense than in common language.) The 
outcomes are regarded as points (elements) of a set S, called the sample space, 
whose subsets are called events. For events E we define a probability P(E) by the 
axioms (Sec. 24.3) 

0 ^ P(E) g 1 

(1) P(S) = 1 

P(E ± U E 2 U • ■ •) = P(£ a ) + P(E 2 ) + • ■ • (Ej H E k = 0). 

These axioms are motivated by properties of frequency distributions of data 
(Sec. 24.1). 

The complement E c of E has the probability 

(2) P(E C ) = 1 - P(E). 

The conditional probability of an event B under the condition that an event A 
happens is (Sec. 24.3) 

. P(A n B) 

(3) P(B\A) = p( ~ [ P(A) > 0]. 

Two events A and B are called independent if the probability of their simultaneous 
appearance in a trial equals the product of their probabilities, that is, if 

(4) P{A r\B) = P(A)P(B). 

With an experiment we associate a random variable X. This is a function defined 
on S whose values are real numbers; furthermore, X is such that the probability 
P(X = a) with which X assumes any value a , and the probability P(ci < X ^ b) 
with which X assumes any value in an interval a < X ^ b are defined (Sec. 24.5). 
The probability distribution of X is determined by the distribution function 

(5) F(x) = P(X^x). 

In applications there are two important kinds of random variables: those of the 
discrete type, which appear if we count (defective items, customers in a bank, etc.) 
and those of the continuous type, which appear if we measure (length, speed, 
temperature, weight, etc.). 
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A discrete random variable has a probability function 

(6) f(x) = P(X = x). 

Its mean /jl and variance o* 2 are (Sec. 24.6) 

(7) fi = 2 Xjf(Xj) and a 2 = 2 (*j ~ m) 2 /(*j) 

where the Xj are the values for which X has a positive probability. Important discrete 
random variables and distributions are the binomial, Poisson, and hypergeometric 
distributions discussed in Sec. 24.7. 

A continuous random variable has a density 

(8) f(x) = F'(x) [see (5)]. 

Its mean and variance are (Sec. 24.6) 

(9) At = I xf(x) dx and a 2 = I (x - fiffix) dx. 

J -zo J -za 

Very important is the normal distribution (Sec. 24.8), whose density is 

(10) /w ■ “ p [“ 1 ffl] 

and whose distribution function is (Sec. 24.8; Tables A7, A8 in App. 5) 

(ID F(x) = . 

A two-dimensional random variable ( X, , Y) occurs if we simultaneously observe 
two quantities (for example, height X and weight Y of adults). Its distribution function 
is (Sec. 24.9) 

(12) F(x, y) =P(X^x,Y^ y). 

X and Y have the distribution functions (Sec. 24.9) 

(13) Fi(x) = P(X S x, Y arbitrary) and F 2 (y) = P(x arbitrary, Y § y) 

respectively; their distributions are called marginal distributions. If both X and Y 
are discrete, then (X, 7) has a probability function 

f(x, y) = P(X = x, Y = y). 

If both X and Y are continuous, then (X, Y ) has a density f(x, y ). 





CHAPTER 2 5 
Mathematical Statistics 


In probability theory we set up mathematical models of processes that are affected by 
“chance”. In mathematical statistics or. briefly, statistics, we check these models against 
the observable reality. This is called statistical inference. It is done by sampling, that 
is, by drawing random samples, briefly called samples. These are sets of values from a 
much larger set of values that could be studied, called the population. An example is 
10 diameters of screws drawn from a large lot of screws. Sampling is done in order to 
see whether a model of the population is accurate enough for practical purposes. If this 
is the case, the model can be used for predictions, decisions, and actions, for instance, in 
planning productions, buying equipment, investing in business projects, and so on. 

Most important methods of statistical inference are estimation of parameters 
(Secs. 25.2), determination of confidence intervals (Sec. 25.3), and hypothesis testing 
(Secs. 25.4, 25.7, 25.8), with application to quality control (Sec. 25.5) and acceptance 
sampling (Sec. 25.6). 

In the last section (25.9) we give an introduction to regression and correlation analysis, 
which concern experiments involving two variables. 

Prerequisite: Chap. 24. 

Sections that may be omitted in a shorter course: 25.5, 25.6, 25.8. 

References, Answers to Problems , and Statistical Tables : App. 1 Part G, App. 2. 
App. 5. 


25.1 Introduction. Random Sampling 

Mathematical statistics consists of methods for designing and evaluating random 
experiments to obtain information about practical problems, such as exploring the relation 
between iron content and density of iron ore, the quality of raw material or manufactured 
products, the efficiency of air-conditioning systems, the performance of certain cars, the 
effect of advertising, the reactions of consumers to a new product, etc. 

Random variables occur more frequently in engineering (and elsewhere) than one 
would think. For example, properties of mass-produced articles (screws, lightbulbs, etc.) 
always show random variation, due to small (uncontrollable!) differences in raw material 
or manufacturing processes. Thus the diameter of screws is a random variable X and we 
have nondefective screws , with diameter between given tolerance limits, and defective 
screws , with diameter outside those limits. We can ask for the distribution of X , for the 
percentage of defective screws to be expected, and for necessary improvements of the 
production process. 
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EXAMPLE 1 


Samples are selected from populations — 20 screws from a lot of 1000, 100 of 5000 
voters, 8 beavers in a wildlife conservation project — because inspecting the entire 
population would be too expensive, time-consuming, impossible or even senseless (think 
of destructive testing of lightbulbs or dynamite). To obtain meaningful conclusions, 
samples must be random selections. Each of the 1 000 screws must have the same chance 
of being sampled (of being drawn when we sample), at least approximately. Only then 
will the sample mean x = (x x + • * • + a' 2O )/20 (Sec. 24.1) of a sample of size n = 20 
(or any other /?) be a good approximation of the population mean jjl (Sec. 24.6); and the 
accuracy of the approximation will generally improve with increasing n. as we shall see. 
Similarly for other parameters (standard deviation, variance, etc.). 

Independent sample values will be obtained in experiments with an infinite sample 
space S (Sec. 24.2), certainly for the normal distribution. This is also true in sampling with 
replacement. It is approximately true in drawing small samples from a large finite population 
(for instance, 5 or 10 of 1000 items). However, if we sample without replacement from a 
small population, the effect of dependence of sample values may be considerable. 

Random numbers help in obtaining samples that are in fact random selections. This 
is sometimes not easy to accomplish because there are many subtle factors that can bias 
sampling (by personal interviews, by poorly working machines, by the choice of nontypical 
observation conditions, etc.). Random numbers can be obtained from a random number 
generator in Maple, Mathematica, or other systems listed on p. 991. (The numbers are 
not truly random, as they would be produced in flipping coins or rolling dice, but are 
calculated by a tricky formula that produces numbers that do have practically all the 
essential features of true randomness.) 

Random Numbers from a Random Number Generator 

To select a sample of size n = 10 from 80 given ball bearings, we number the bearings from I to 80. We then 
let the generator randomly produce 10 of the integers from 1 to 80 and include the bearings with the numbers 
obtained in our sample, for example. 

44 55 53 03 52 61 67 78 39 54 

or whatever. 

Random numbers are also contained in (older) statistical tables. El 


Representing and processing data were considered in Sec. 24.1 in connection with 
frequency distributions. These are the empirical counterparts of probability distributions 
and helped motivating axioms and properties in probability theory. The new aspect in this 
chapter is randomness: the data are samples selected randomly from a population. 
Accordingly, we can immediately make the connection to Sec. 24.1, using stem-and-leaf 
plots, box plots, and histograms for representing samples graphically. 

Also, we now call the mean x in (5), Sec. 24.1, the sample mean 


( 1 ) 


1 A 1 

X = — 2j x j = — + *2 + 


n 




n 


+ *«)• 


We call n the sample size, the variance s 2 in (6), Sec. 24.1, the sample variance 

52 = ~ _ j 2 O 9 _ x) z = ~ ~ j [U‘i - x) 2 + • • • + (x n - *) 2 ], 
j=l 
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and its positive square root s the sample standard deviation, x, s 2 , and s are called 
parameters of a sample ; they will be needed throughout this chapter. 


25.2 Point Estimation of Parameters 

Beginning in this section, we shall discuss the most basic practical tasks in statistics and 
corresponding statistical methods to accomplish them. The first of them is point estimation 
of parameters, that is, of quantities appearing in distributions, such as p in the binomial 
distribution and p and cr in the normal distribution. 

A point estimate of a parameter is a number (point on the real line), which is computed 
from a given sample and serves as an approximation of the unknown exact value of the 
parameter of the population. An interval estimate is an interval (“ confidence internal”) 
obtained from a sample; such estimates will be considered in the next section. Estimation 
of parameters is of great practical importance in many applications. 

As an approximation of the mean /x of a population we may take the mean x of a 
corresponding sample. This gives the estimate p = x for p , that is. 


( 1 ) 


P = X = — (*! + • • • + X n ) 


where n is the sample size. Similarly, an estimate a 2 for the variance of a population is 
the variance s 2 of a corresponding sample, that is. 


( 2 ) 


<T 


2 = . 9 2 = 


n — 1 


2 (.Xj - x) 2 




Clearly, (1) and (2) are estimates of parameters for distributions in which p or a 2 
appear explicity as parameters, such as the normal and Poisson distributions. For the 
binomial distribution, p = pin [see (3) in Sec. 24.7 1. From (1) we thus obtain for/? 
the estimate 


(3) p = — . 

n 

We mention that (1) is a special case of the so-called method of moments. In this 
method the parameters to be estimated are expressed in terms of the moments of the 
distribution (see Sec. 24.6). In the resulting formulas those moments of the distribution 
are replaced by the corresponding moments of the sample. This gives the estimates. Here 
the Arth moment of a sample jc 1s • • ■ , x n is 
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EXAMPLE 1 


Maximum Likelihood Method 

Another method for obtaining estimates is the so-called maximum likelihood method of 
R. A. Fisher [Messenger Math. 41 (1912), 155-160]. To explain it, we consider a discrete 
(or continuous) random variable X whose probability function (or density) f(x) depends 
on a single parameter 0. We take a corresponding sample of n independent values 
.\*i, • • * . x n . Then in the discrete case the probability that a sample of size n consists 
precisely of those n values is 

( 4 ) / = f(x x )f(x 2 ) • • • f(x n ). 


In the continuous case the probability that the sample consists of values in the small 
intervals Xj ^ x ^ Xj H- Aa* (./ = 1 , 2, • • • , /?) is 

(5) /(ai)A.v f(x 2 ) Ax • • • /( x n )Ax = i(Ax) n . 

Since f(xj) depends on 0 , the function / in (5) given by (4) depends on jt x , • ■ • , x n and 
6. We imagine x v • • • . x n to be given and fixed. Then / is a function of 0 , which is called 
the likelihood function. The basic idea of the maximum likelihood method is quite simple, 
as follows. We choose that approximation for the unknown value of 0 for which / is as 
large as possible. If / is a differentiable function of 0 , a necessary condition for / to have 
a maximum in an interval (not at the boundary) is 


(We write a partial derivative, because / depends also on a* 1? • • • , x n .) A solution of (6) 
depending on ,v ls • • • , x n is called a maximum likelihood estimate for 0. We may replace 
(6) by 


( 7 ) 


d In / 
00 


= 0 , 


because f(xj) > 0, a maximum of / is in general positive, and In / is a monotone increasing 
function of /. This often simplifies calculations. 


Several Parameters. If the distribution of X involves /* parameters 0 U • * • , 0^ then 
instead of (6) we have the r conditions dl/d0 1 = 0, • • • , dl/c)0 r = 0, and instead of (7) 
we have 


( 8 ) 


d In/ 
dd l 


c) I n / 
d0 r 


Normal Distribution 

Find maximum likelihood estimates for $i - p and 0 2 = 0" in the case of the normal distribution. 
Solution . From (1). Sec. 24.8, and (4) we obtain the likelihood function 
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Taking logarithms, we have 

In / = — n In \Z2rr - n In cr — h. 
The first equation in (8) is d(ln !)lbp = 0, written out 

hence 


a In/ bh 1 ” 

— - — = - — = — Uj - m) = o, 

(Ip L dpi (T “j J 


The solution is the desired estimate p for p: we find 


2 -1(1 = 0. 


A= - 2 Aj = 1 


" j-i 


The second equation in (8) is a (In l)lbcr = 0. written out 
a In / 


< lo- 


ti bh 

cr bcr 


!L 1 

(T < 7 ° 


u an ii x 0 

= = + jl (Aj- - m) = 0. 


J“1 


Replacing /x by /land solving for (7 2 3 4 5 6 7 , we obtain the estimate 


_ 2 


<r 2 = ~ 2 Cty “ -v) 


which we shall use in Sec. 25.7. Note that this differs from (2). We cannot discuss criteria for the goodness of 
estimates but want to mention that for small w, formula (2) is preferable. M 




1. Find the maximum likelihood estimate for the 
parameter p of a normal distribution with known 
variance cr 2 = cr 0 2 

2. Apply the maximum likelihood method to the normal 
distribution with p = 0. 

3. (Binomial distribution) Derive a maximum likelihood 
estimate for p. 

4. Extend Prob. 3 as follows. Suppose that m times n 
trials were made and in the first n trials A happened 
k x times, in the second n trials A happened k 2 times, 
* • • , in the wth n trials A happened fc, n times. Find a 
maximum likelihood estimate of p based on this 
information. 

5. Suppose that in Prob. 4 we made 4 times 5 trials and 
A happened 2, 1 , 4, 4 times, respectively. Estimate p. 

6. Consider X = Number of independent trials until an 
event A occurs. Show that X has the probability 
function f(x) = pq x ~ l < x = 1 , 2, • • • , where p is the 
probability of A in a single trial and q = 1 — p. Find 
the maximum likelihood estimate of p corresponding 
to a sample .v T , • • • , x n of observed values of X. 

7. In Prob. 6 find the maximum likelihood estimate of p 
corresponding to a single observation _v of X. 


8. In rolling a die, suppose that we get the first Six in the 
7th trial and in doing it again we get it in the 6th trial. 
Estimate the probability p of getting a Six in rolling 
that die once. 

9. (Poisson distribution) Apply the maximum likelihood 
method to the Poisson distribution. 

10. (Uniform distribution) Show that in the case of the 
parameters a and b of the uniform distribution (see 
Sec. 24.6), the maximum likelihood estimate cannot be 
obtained by equating the first derivative to zero. How 
can we obtain maximum likelihood estimates in this 
case? 

11. Find the maximum likelihood estimate of 0 in the 
density f{x) = $e~ 0x if x ^ 0 and /( x) = 0 if x < 0. 

12. In Prob. 11, find the mean p, substitute it in /(.v), find 
the maximum likelihood estimate of /x. and show that 
it is identical with the estimate for p which can be 
obtained from that for 0 in Prob. 1 1 . 

13. Compute 0 in Prob. 11 from the sample 1.8, 0.4. 0.8, 
0.6, 1 .4. Graph the sample distribution function F( x) 
and the distribution function F(x) of the random 
variable, with 0 = 0, on the same axes. Do they agree 
reasonably well? (We consider goodness of fit 
systematically in Sec. 25.7.) 
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14. Do the same task as in Prob. 13 if the given sample is 
0.5, 0.7, 0.1, 1.1. 0.1. 

15. CAS EXPERIMENT. Maximum Likelihood 
Estimates. (MLEs). Find experimentally how much 


MLEs can differ depending on the sample size. Hint. 
Generate many samples of the same size n, e.g., of the 
standardized normal distribution, and record x and s 2 . 
Then increase n. 


253 Confidence Intervals 

Confidence intervals 1 for an unknown parameter 0 of some distribution (e.g., 0 = fx) are 
intervals Q 1 ^ 0 ^ 0 2 that contain 0 , not with certainty but with a high probability y, 
which we can choose (95% and 99% are popular). Such an interval is calculated from a 
sample, y = 95% means probability 1 — y = 5% = 1/20 of being wrong — one of about 
20 such intervals will not contain 0. Instead of writing ^ 0 ^ 0 2 , we denote this more 
distinctly by writing 

(1) CONF y {0 l ^ ^ e 2 \. 

Such a special symbol, CONF, seems worthwhile in order to avoid the misunderstanding 
that 6 must lie between 0 X and 0 2 . 

y is called the confidence level, and 0 1 and 0 2 are called the lower and upper 
confidence limits. They depend on y. The larger we choose y, the smaller is the error 
probability 1 — y, but the longer is the confidence interval. If y— » 1, then its length goes 
to infinity. The choice of y depends on the kind of application. In taking no umbrella, a 
5% chance of getting wet is not tragic. In a medical decision of life or death, a 5% chance 
of being wrong may be too large and a 1% chance of being wrong (y = 99%) may be 
more desirable. 

Confidence intervals are more valuable than point estimates (Sec. 25.2). Indeed, we can 
take the midpoint of (1) as an approximation of 0 and half the length of (1) as an “error 
bound” (not in the strict sense of numerics, but except for an error whose probability we 
know). 

0j and 0 2 in (1 ) are calculated from a sample a* x , • • • , x n . These are n observations of 
a random variable X. Now comes a standard trick. We regard x u • • • , x n as single 
observations of n random variables X lt • • • , X n ( with the same distribution , namely , that 
ofX). Then 0j = 0 1 (a* 1 , • • • , x n ) and 0 2 = 0 2 (a 1? • * • , a* ? 1 ) in (1) are observed values of 
two random variables Q 1 = Qi(X v — * , X n ) and 0 2 = 0 2 (X lf • • ■ . X v ). The condition 

(1) involving yean now be written 

(2) P(Q ! ^ 0 ^ 0 2 ) = y. 

Let us see what all this means in concrete practical cases. 

In each case in this section we shall first state the steps of obtaining a confidence interval 
in the form of a table, then consider a typical example, and finally justify those steps 
theoretically. 


1 JERZV NEYMAN (1894-1981). American statistician, developed the theory of confidence intervals ( Armais 

of Mathematical Statistics 6 (1935). 111-116). 
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EXAMPLE 1 


THEOREM 1 


PROOF 


Confidence Interval for p of the Normal Distribution 
with Known a 2 


(3) 


Table 25.1 Determination of a Confidence Interval for the Mean pc of 
a Normal Distribution with Known Variance <r 2 

Step 1. Choose a confidence level y (95%, 99%, or the like). 

Step 2. Determine the corresponding c: 


y 

0.90 

0.95 

0.99 

0.999 

c 

1.645 

1.960 

2.576 

3.291 


Step 3. Compute the mean x of the sample x v • • • , x n . 

Step 4 . Compute k = ccr/VJi. The confidence interval for fx is 

CONF y {* - * + *}. 


Confidence Interval for p of the Normal Distribution with Known cr 2 

Detcrimine a 95% confidence interval for the mean of a normal distribution with variance a 2 = 9, using a 
sample of n * 100 values with mean x = 5. 

Solution . Step 1. y = 0.95 is required. Step 2. The corresponding c equals 1.960; see Table 25.1. 
Step 3. x = 5 is given. Step 4. We need k = 1.960 • 3/VT00 = 0.588. Hence .v - k = 4.412. x + k = 5.588 
and the confidence interval is CONF 0 95 {4.412 ^ p ^ 5.588}. 

This is sometimes written p - 5 ± 0.588, but we shall not use this notation, which can be misleading. 

With your CAS you can determine this interval more directly. Similarly for the other examples in this section. ■ 

Theory for Table 25.1. The method in Table 25.1 follows from the basic 


Sum of Independent Normal Random Variables 

Let X x , • • • , X n be independent normal random variables each of which has mean 
jji and variance a 2 . Then the following holds. 

(a) The sum X x 4* • • • + X n is normal with mean nji and variance ncr 2 . 

(b) The following random variable X is normal with mean /jl and variance cr 2 /n . 

(4) X = ^ (X x + • • • + X n ) 


(c) The following random variable Z is normal with mean 0 and variance 1. 


(5) 


Z = 


X - ix 


dl\fn 


The statements about the mean and variance in (a) follow from Theorems 1 and 3 in 
Sec. 24.9. From this and Theorem 2 in Sec. 24.6 we see that X has the mean (\in)nfx = fi 
and the variance (1 /n) 2 na 2 = cr 2 //?. This implies that Z has the mean 0 and variance 1, 
by Theorem 2(b) in Sec. 24.6. The normality of X 1 + • • • + is proved in Ref. [G3] 
listed in App. I. This implies the normality of (4) and (5). ■ 
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EXAMPLE 2 


Derivation of (3) in Table 25.1. Sampling from a normal distribution gives independent 
sample values (see Sec. 25.1), so that Theorem 1 applies. Hence we can choose y and 
then determine c such that 

^ cj = 3>(c) — <f>(— c) = y. 

For the value y = 0.95 we obtain z(D) = 1.960 from Table A8 in App. 5, as used in 
Example 1. For y = 0.9, 0.99. 0.999 we get the other values of c listed in Table 25.1. 
Finally, all we have to do is to convert the inequality in (6) into one for /jl and insert 
observed values obtained from the sample. We multiply — c^Z^c by— 1 and then by 
cr!\fn , writing cafVn = k (as in Table 25.1), 


( 6 ) 


P(-c ^ Z ^ c) = P g — 


P 


a/y/n 


P(-c ^ Z ^ c) = P(c ^ -Z ^ -c) = P 



cr/Vn 


= P(k ^ - X ^ -k) = y. 


Adding X gives P(X + k^iJL^X-k)= y or 

(7) P(X - k^ fJL^X + k) = y. 

Inserting the observed value x of X gives (3). Here we have regarded x ly • • • , x n as single 
observations of X l9 * • * , X n (the standard trick!), so that*! + • • • + x n is an observed 
value of X x + • • • +_X n and x is an observed value of X. Note further that (7) is of the 
form (2) with 0 X = X — k and 0 2 = X 4- L ■ 


Sample Size Needed for a Confidence Interval of Prescribed Length 

How large must n be in Example 1 if we want to obtain a 95% confidence interval of length L = 0.4? 
Solution . The interval (3) has the length L = 2k = IcgTJTi. Solving for n, we obtain 

n = (2 caiLf. 

In the present case the answer is n = (2 • 1 .960 • 3/0.4) 2 «= 870. 

Figure 525 shows how L decreases as n increases and that for y = 99% the confidence interval is substantially 
longer than for y = 95% (and the same sample size n). M 



Fig. 525. Length of the confidence interval (3) (measured in multiples of o) 
as a function of the sample size n for y = 95% and y = 99% 
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Confidence Interval for fi of the Normal Distribution 
With Unknown cr 2 

In practice cr 2 is frequently unknown. Then the method in Table 25.1 does not help and 
the whole theory changes, although the steps of determining a confidence interval for p. 
remain quite similar. They are shown in Table 25.2. We see that k differs from that in 
Table 25.1, namely, the sample standard deviation s has taken the place of the unknown 
standard deviation cr of the population. And c now depends on the sample size n and must 
be determined from Table A9 in App. 5 or from your CAS. That table lists values z for 
given values of the distribution function (Fig. 526) 

f z / l{ 2 \-(m+ 1)/2 

(8) F(z) = ^/ ae ( 1 + -) du 

of the /-distribution. Here, m (= 1, 2, • • •) is a parameter, called the number of degrees 
of freedom of the distribution ( abbreviated d.f.). In the present case, 
m = n — 1; see Table 25.2. The constant Km Is such that F(o°) = I. By integration it 
turns out that K m = + |)/[V/ 777 rr(| m)J, where T is the gamma function (see (24) 

in App. A3.1). 


(9) 


( 10 ) 


Table 2S.2 Determination of a Confidence Interval for the Mean /£ 
of a Normal Distribution with Unknown Variance o * 2 

Step 1. Choose a confidence level y (95%, 99%, or the like). 

Step 2 . Determine the solution c of the equation 

F(c)=±( 1 + y) 

from the table of the /-distribution with n — 1 degrees of freedom 
(Table A9 in App. 5; or use a CAS; n = sample size). 

Step 5. Compute the mean x and the variance s 2 of the sample 

• • • * v n - 

Step 4 . Compute k = cs/'s/n. The confidence interval is 
CONF y {x - k ^ fi ^ x + k). 



Fig. 526. Distribution functions of the f- 
distribution with 1 and 3 d.f. and of the 
standardized normal distribution (steepest curve) 



Fig. 527. Densities of the /-distribution 
with 1 and 3 d.f. and of the standardized 
normal distribution 
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EXAMPLE 3 


THEOREM 2 


Figure 527 compares the curve of the density of the f-distribution with that of the normal 
distribution. The latter is steeper. This illustrates that Table 25.1 (which uses more 
information, namely, the known value of a 2 ) yields shorter confidence intervals than Table 
25.2. This is confirmed in Fig. 528, which also gives an idea of the gain by increasing 
the sample size. 



Fig. 528. Ratio of the lengths L ' and L of the confidence 
intervals (10) and (3) with y = 95% and y = 99% as a function 
of the sample size n for equal s and a 

Confidence Interval for /a of the Normal Distribution with Unknown a 2 

Five independent measurements of the point of inflammation (flash point) of Diesel oil (D-2) gave the values 
(in °F) 144 147 146 142 144. Assuming normality, determine a 99% confidence interval for the mean. 

Solution . Step L y = 0.99 is required. 

Step 2. F(c) = ^(1 + y) = 0.995. and Table A9 in App. 5 with n — l = 4 d.f. gives c = 4.60. 

Step 3. x = 144.6, s 2 = 3.8. 

Step 4. k = VTI -4.60/V5 = 4.01. The confidence interval is CONF 0.99 ( 140.5 SfiS 148.7]. 

if the variance cr 2 were known and equal to the sample variance s 2 , thus cr 2 = 3.8, then Table 25.1 would 
give k = catVn = 2.576VI8/V5 = 2.25 and CONF 0 . 99 ( 142.35 £ p£ 146.85). We see that the present 
interval is almost twice as long as that obtained from Table 25.1 (with it 2 = 3.8). Hence for small samples the 
difference is considerable! See also Fig. 528. ■ 


Theory for Table 25.2. For deriving (10) in Table 25.2 we need from Ref. [G3] 


Student’s t-Distribution 

Let Xi, ••• , X n be independent normal random variables with the same mean p 
and the same variance cr 2 . Then the random variable 


(ID 


X - p. 

T ~ S/Vn 


has a t-distribution [see (8)] with n - 1 degrees of freedom (d.f.); here X is given 
by (4) and 

( 12 ) S z 2 (X; - X) 2 . 

n ~ 1 j-i 
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EXAMPLE 4 


Derivation of (10). This is similar to the derivation of (3). We choose a number y 
between 0 and 1 and determine a number c from Table A9 in App. 5 with n — 1 d.f. (or 
from a CAS) such that 

(13) P(-c ^T^c) = F(c) - F(-c) = y. 

Since the f-distribution is symmetric, we have 

Ft-c) = 1 ~ F(c\ 

and (13) assumes the form (9). Substituting (11) into (13) and transforming the result as 
before, we obtain 

(14) P(X — K^/ji^X + K) = y 
where 

K = cS/Vn. 

By inserting the observed values x of X and .v 2 of S 2 into (14) we finally obtain (10). ■ 

Confidence Interval for the Variance cr 2 
of the Normal Distribution 

Table 25.3 shows the steps, which are similar to those in Tables 25.1 and 25.2. 


Table 25.3 Determination of a Confidence interval for the Variance 
a 2 of a Normal Distribution, Whose Mean Need Not Be Known 

Step 1. Choose a confidence level y (95%, 99%. or the like). 

Step 2. Determine solutions c y and c 2 of the equations 

(15) F( Cl ) = |(1 - y), F(c 2 ) = |(1 + y) 

from the table of the chi-square distribution with n - 1 degrees of 
freedom (Table A10 in App. 5: or use a CAS; n = sample size). 

Step 3 . Compute (n — 1 ).v 2 , where s 2 is the variance of the sample 

Ai, * • * , A* w . 

Step 4 . Compute k\ = (/? — 1 )s 2 /c x and k 2 = (n - l )s 2 /c 2 . The 
confidence interval is 

(16) CONF y [k 2 ^ a* 2 ^ A^}. 


Confidence Interval for the Variance of the Normal Distribution 

Determine a 95% confidence interval (16) for the variance, using Table 25.3 and a sample (tensile strength of 
sheet steel in kg/mm 2 , rounded to integer values) 


89 84 87 81 89 86 91 90 78 89 87 99 83 89. 
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Solution, Step 1. y = 0.95 is required. 

Step 2. For n - 1 = 1 3 we find 

^ = 5.01 and c 2 = 24.74. 

Step 3, 13s 2 = 326.9. 

Step 4. 13 s 2 /cj = 65.25, 13.v 2 /c 2 = 13.21. 

The confidence interval is 

CONF 0i 95 ( 13.21 ^ o- 2 ^ 65.251. 

This is rather large, and for obtaining a more precise result, one would need a much larger sample. H 

Theory for Table 25.3. In Table 25.1 we used the normal distribution, in Table 25.2 
the f-distribution, and now we shaLl use the ^-distribution ( chi-square distribution), 
whose distribution function is F(z) = 0 if z < 0 and 

F(z) = C m \ rt M/ 2 du \f z = 0 (Fig. 529). 

J o 



Fig. 529. Distribution function of the chi-square distribution with 2, 3, 5 d.f. 


The parameter m (= 1, 2, • • •) is called the number of degrees of freedom (d.f.), and 

C m = l/[2” l/2 r(|m)]. 

Note that die distribution is not symmetric (see also Fig. 530). 

For deriving (16) in Table 25.3 we need the following theorem. 


THEOREM 3 


Chi-Square Distribution 

Under the assumptions in Theorem 2 the random variable 

S 2 

(17) y = (/z - 1) _ 

with S 2 given by (12) has a chi-square distribution with n - 1 degrees of freedom. 


Proof in Ref. [G3], listed in App. 1. 
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Derivation of (16). This is similar to the derivation of (3) and (10). We choose a 
number y between 0 and 1 and determine c x and c 2 from Table A10, App. 5, such that 
[see (15)] 


P(Y S§ Cl) = F(ci) = 1(1 - r), P(Y S c 2 ) = F(c 2 ) = 1(1 + 7). 


Subtraction yields 


P( Cl ^Y^c 2 ) = P{Y ^ c 2 ) - P(Y ^ c x ) = F(c 2 ) - F(c x ) = y. 

Transforming c x ^ Y ^ c 2 with Y given by (17) into an inequality for <x 2 , we obtain 

n - 1 9 9 _ n - 1 9 

S 2 ^ a 2 ^ S 2 . 

C 2 Cj 

By inserting the observed value s 2 of S' 2 we obtain (16). 


Confidence Intervals for Parameters 
of Other Distributions 

The methods in Tables 25.1-25.3 for confidence intervals for fx and cr 2 are designed for 
the normal distribution. We now show that they can also be applied to other distributions 
if we use large samples. 

We know that if X ± , • • • , are independent random variables with the same mean /jl 
and the same variance cr 2 , then their sum Y n = X x 4- • • • 4- X n has the following properties. 

(A) Y n has the mean nfi and the variance ncr 2 (by Theorems 1 and 3 in Sec. 24.9). 

(B) If those variables are normal, then Y n is normal (by Theorem 1). 

If those random variables are not normal, then (B) is not applicable. However, for large 
n the random variable Y n is still approximately normal. This follows from the central limit 
theorem, which is one of the most fundamental results in probability theory. 
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THEOREM 4 


Central Limit Theorem 

Let X 1% • • • , X n , • • • be independent random variables that have the same 
distribution function and therefore the same mean p and the same variance a 2 . Let 
Y n = Xi + • • • + X n . Then the random variable 


( 18 ) 



is asymptotically normal with mean 0 and variance 1; that is , the distribution 
function F n {x) of Z n satisfies 


lim F n (x) = <£(*) = 


1 


V2tt 



A proof can be found in Ref. [G3] listed in App. 1. 

Hence when applying Tables 25.1-25.3 to a nonnormal distribution, we must use 
sufficiently large samples. As a rule of thumb, if the sample indicates that the skewness 
of the distribution (the asymmetry: see Team Project 16(d), Problem Set 24.6) is small, 
use at least n = 20 for the mean and at least n = 50 for the variance. 


^ 


[W7] MEAN (VARIANCE KNOWN) 

1. Find a 95% confidence interval for the mean p of a 
normal population with standard deviation 4.00 from 
the sample 30, 42, 40, 34, 48, 50. 

2. Does the interval in Prob. 1 get longer or shorter if we 
take y = 0.99 instead of 0.95? By what factor? 

3. By what factor does the length of the interval in Prob. 1 
change if we double the sample size? 

4. Find a 90% confidence interval for the mean p of a 
normal population with variance 0.25, using a sample 
of 100 values with mean 212.3. 


MEAN (VARIANCE UNKNOWN) 

Find a 99% confidence interval for the mean of a normal 
population from the sample: 


&-I2 


8. 425, 420, 425, 435 


9. Length of 20 bolts with sample mean 20.2 cm and 
sample variance 0.04 cm 2 

10. Knoop hardness of diamond 9500, 9800, 9750, 9200, 
9400, 9550 


11. Copper content (%) of brass 66, 66, 65, 64, 66, 67, 64, 
65, 63, 64 

12. Melting point (°C) of aluminum 660, 667, 654, 663, 662 


5. What sample size would be needed for obtaining a 95% 
confidence interval (3) of length 2(7? Of length crl 

6. (Use of Fig. 525) Find a 95% confidence interval for 
a sample of 200 values with mean 1 20 from a normal 
distribution with variance 4, using Fig. 525. 

7. What sample size is needed to obtain a 99% confidence 
interval of length 2.0 for the mean of a normal 
population with variance 25? Use Fig. 525. Check by 
calculation. 


13. Find a 95% confidence interval for the percentage of 
cars on a certain highway that have poorly adjusted 
brakes, using a random sample of 500 cars stopped at 
a roadblock on that highway, 87 of which had poorly 
adjusted brakes. 

14. Find a 99% confidence interval for p in the binomial 
distribution from a classical result by K. Pearson, who 
in 24000 trials of tossing a coin obtained 12012 Heads. 
Do you think that the coin was fair? 
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15-20 


VARIANCE 


Find a 95% confidence interval for the variance of a normal 
population from the sample: 

15. A sample of 30 values with variance 0.0007 

16. The sample in Prob. 9 

17. The sample in Prob. 1 1 

18. Carbon monoxide emission (grams per mile) of a 
certain type of passenger car (cruising at 55 mph): 


17.3. 17.8, 18.0. 17.7, 18.2, 17.4. 17.6, 18.1 


19. Mean energy (keV) of delayed neutron group (Group 
3, half-life 6.2 sec.) for uranium U 235 fission: 435, 451, 
430, 444, 438 

20. Ultimate tensile strength (k psi) of alloy steel 
(Maraging H) at room temperature: 25 1, 255, 258, 253, 
253, 252, 250, 252, 255, 256 


21. If X is normal with mean 27 and variance 16, what 
distributions do — X, 3X, and 5X — 2 have? 

22. If Xi and X 2 are independent normal random variables 


with mean 23 and 4 and variance 3 and 1 , respectively, 
what distribution does 4X 1 — X 2 have? Hint. Use Team 
Project 14(g) in Sec. 24.8. 

23. A machine fills boxes weighing Y lb with X lb of salt, 
where X and Y are normal with mean 100 lb and 5 lb 
and standard deviation l lb and 0.5 lb, respectively. 
What percent of filled boxes weighing between 104 lb 
and 106 lb are to be expected? 

24. If the weight X of bags of cement is normally 
distributed with a mean of 40 kg and a standard 
deviation of 2 kg, how many bags can a delivery truck 
carry so that the probability of the total load exceeding 
2000 kg will be 5%? 

25. CAS EXPERIMENT. Confidence Intervals. Obtain 
100 samples of size 10 of the standardized normal 
distribution. Calculate from them and graph the 
corresponding 95% confidence intervals for the mean 
and count how many of them do not contain 0. Does 
the result support the theoiy? Repeat the whole 
experiment, compare and comment. 


25.4 Testing of Hypotheses. Decisions 

The ideas of confidence intervals and of tests 2 are the two most important ideas in modem 
statistics. In a statistical test we make inference from sample to population through testing 
a hypothesis, resulting from experience or observations, from a theory or a quality 
requirement, and so on. In many cases the result of a test is used as a basis for a decision, 
for instance, to buy (or not to buy) a certain model of car, depending on a test of the fuel 
efficiency (miles/gal) (and other tests, of course), to apply some medication, depending 
on a test of its effect; to proceed with a marketing strategy, depending on a test of consumer 
reactions, etc. 

Let us explain such a test in terms of a typical example and introduce the corresponding 
standard notions of statistical testing. 

EXAMPLE 1 Test of a Hypothesis. Alternative. Significance Level a 

We want to buy 100 coils of a certain kind of wire, provided we can verify the manufacturer’s claim that the 
wire has a breaking limit m = Mo = 200 lb (or more). This is a test of the hypothesis (also called null hypothesis ) 
/x = /xo = 200. We shall not buy the wire if the (statistical) test shows that actually p — p\< Mo- the wire is 
weaker, the claim does not hold, mi is called the alternative (or alternative hypothesis) of the test. We shall 
accept the hypothesis if the test suggests that it is true, except for a small error probability a, called the 
significance level of the test. Otherwise we reject the hypothesis. Hence a is the probability of rejecting a 
hypothesis although it is true. The choice of a is up to us. 5% and 1 % are popular values. 

For the test we need a sample. We randomly select 25 coils of the wire, cut a piece from each coil, and 
determine the breaking limit experimentally. Suppose that this sample of n = 25 values of the breaking limit 
has the mean .v = 197 lb (somewhat less than the claim!) and the standard deviation 5 = 6 lb. 


beginning around 1930, a systematic theory of tests was developed by NEYMAN (see Sec. 25.3) and EGON 
SHARPE PEARSON (1895-1980), English statistician, the son of Karl Pearson (see the footnote on p. 1066). 
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At this point we could only speculate whether this difference 197 — 200 = —3 is due to randomness, is a 
chance effect, or whether it is significant, due to the actually inferior quality of the wire. To continue beyond 
speculation requires probability theory, as follows. 

We assume that the breaking limit is normally distributed. (This assumption could be tested by the method 
in Sec. 25.7. Or we could remember the central limit theorem (Sec. 25.3) and take a still larger sample.) Then 

rp 

SA/n 

in (1 1), Sec. 25.3, with fju = /Xq has a /-distribution with n - I degrees of freedom (// - I = 24 for our sample). 
Also x - 197 and s = 6 are observed values of X and S to be used later. We can now choose a significance 
level, say. a = 5%. From Table A9 in App. 5 or from a CAS we then obtain a critical value c such that 
P(T c) = a = 5%. For P(T ^ ?) = 1 - a = 95% the table gives? = 1.71, so that c = - c = - 1.71 because 
of the symmetry of the distribution (Fig. 531). 

We now reason as follows — this is the crucial idea of the test. If the hypothesis is true, we have a chance of 
only a (= 5%) that we observe a value / of T (calculated from a sample) that will fall between and -1.71. 
Hence if we nevertheless do observe such a /, we assert that the hypothesis cannot be true and we reject it. Then 
we accept the alternative. If, however, t ^ c, we accept the hypothesis. 

A simple calculation finally gives t = (197 — 200)/(6/V25) = -2.5 as an observed value of T. Since 
-2.5 < - 1.71, we reject the hypothesis (the manufacturer's claim) and accept the alternative 200, 

the wire seems to be weaker than claimed. H 



Fig. 531. t-distribution in Example 1 


This example illustrates the steps of a test: 

1. Formulate the hypothesis 0 = 0 O to be tested. (0o = Mo * n the example.) 

2. Formulate an alternative 0 = 0 X . (0j = jx x in the example.) 

3. Choose a significance level a (5%, 1%, 0.1%). 

4 . Use a random variable 9 = g(X x , • • • , X n ) whose distribution depends on the 
hypothesis and on the alternative, and this distribution is known in both cases. Determine 
a critical value c from the distribution of 0, assuming the hypothesis to be true. (In the 
example, 0=7’. and c is, obtained from P{T = c) = a.) 

5. Use a sample x lf • * • , x n to determine an observed value 0 = g(A' 1? ••• , x n ) of 0. 
(r in the example.) 

6. Accept or reject the hypothesis, depending on the size of 0 relative to c. (t < c in 
the example, rejection of the hypothesis.) 

Two important facts require further discussion and careful attention. The first is the 
choice of an alternative. In the example, < p 0f but other applications may require 
Mi > Mo or Mi * The second fact has to do with errors. We know that a (the 
significance level of the test) is the probability of rejecting a true hypothesis. And we 
shall discuss the probability j 6 of accepting a false hypothesis. 
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One-Sided and Two-Sided Alternatives (Fig. 532) 

Let 0 be an unknown parameter in a distribution, and suppose that we want to test the 
hypothesis 0 = 0 O . Then there are three main kinds of alternatives, namely, 


(1) 

0> 0 O 

(2) 

e< 00 

(3) 

9 # 9q. 


(1) and (2) are one-sided alternatives, and (3) is a two-sided alternative. 

We call rejection region (or critical region) the region such that we reject the 
hypothesis if the observed value in the test falls in this region. In (D the critical c lies to 
the right of 0 O because so does the alternative. Hence the rejection region extends to the 
right. This is called a right-sided test. In ® the critical c lies to the left of 0 O (as in 
Example 1), the rejection region extends to the left, and we have a left-sided test 
(Fig. 532, middle part). These are one-sided tests. In © we have two rejection regions. 
This is called a two-sided test (Fig. 532, lower part). 

All three kinds of alternatives occur in practical problems. For example, (1) may arise 
if 0 O is die maximum tolerable inaccuracy of a voltmeter or some other instrument. 
Alternative (2) may occur in testing strength of material, as in Example 1. Finally, 0 O in 
(3) may be the diameter of axle-shafts, and shafts that are too thin or too thick are equally 
undesirable, so that we have to watch for deviations in both directions. 


© 

© 


Acceptance Region 

Rejection Region 

Do not reject hypothesis 

(Critical Region) 

(Accept hypothesis) 

1 

Reject hypothesis 

e o 



c 


Acceptance Region 
Do not reject hypothesis 
(Accept hypothesis) 


o 


Rejection Region 



c 


Acceptance Region 

Rejection Region Do not reject Rejection Region 

(Critical Region) hypothesis (Critical Region) 

Reject hypothesis I (Accept hypothesis) I Reject hypothesis 



Fig. 532. Test in the case of alternative (1) (upper part of the figure), alternative 
(2) (middle part), and alternative (3) 


Errors in Tests 

Tests always involve risks of making false decisions: 

(I) Rejecting a true hypothesis (Type I error). 
a = Probability of making a Type I error. 

(II) Accepting a false hypothesis (Type II error). 

£ = Probability of making a Type n error. 
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Clearly, we cannot avoid these errors because no absolutely certain conclusions about 
populations can be drawn from samples. But we show that there are ways and means of 
choosing suitable levels of risks, that is, of values a and /3. The choice of a depends on 
the nature of the problem (e.g., a small risk a = 1% is used if it is a matter of life or 
death). 

Let us discuss this systematically for a test of a hypothesis 0 = 0 O against an alternative 
that is a single number 9 x , for simplicity. We let 0 X > 0 O , so that we have a right-sided 
test. For a left-sided or a two-sided test the discussion is quite similar. 

We choose a critical c > 9 0 (as in the upper part of Fig. 532, by methods discussed 
below). From a given sample x l9 • • • , x n we then compute a value 


0 = i. ■ • • , x n ) 


with a suitable g (whose choice will be a main point of our further discussion; for instance, 
take g = ( *! + ■*•+ x n )/n in the case in which 9 is the mean). If 9 > c, we reject the 
hypothesis. If 0 ^ c, we accept it. Here, the value 9 can be regarded as an observed value 
of the random variable 

(4) 0 = g(X ly •••,**) 

because Xj may be regarded as an observed value of Xj, j = 1, •••,«. In this test there 
are two possibilities of making an error, as follows. 

Type I Error (see Table 25.4). The hypothesis is true but is rejected (hence the 
alternative is accepted) because 0 assumes a value 0 > c. Obviously, the probability of 
making such an error equals 

( 5 ) P(9>c)o= 0o = a. 

a is called the significance level of the test, as mentioned before. 

Type II Error (see Table 25.4). The hypothesis is false but is accepted because 0 
assumes a value 9 ^ c. The probability of making such an error is denoted by /?; thus 

( 6 ) P(Q^c) e „ 0l = p. 

7] = 1 — p is called the power of the test. Obviously, the power 77 is the probability of 
avoiding a Type II error. 


Table 25.4 Type I and Type II Errors in Testing a Hypothesis 
6 = 9 0 Against an Alternative 0=0, 



Unknown Truth 
9 ~ 0q B — 9\ 

1 0=00 

0 - 

True decision 
P = \ - a 

Type II error 

P = p 

Acc 

If 

Type 1 error 
P = a 

True decision 
P = 1 - p 
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Formulas (5) and ( 6 ) show that both a and ft depend on c, and we would like to choose 
c so that these probabilities of making errors are as small as possible. But the important 
Figure 533 shows that these are conflicting requirements because to let a decrease we 
must shift c to the right, but then (3 increases. In practice we first choose a (5%, sometimes 
1 %), then determine c , and finally compute /3. If /3 is large so that the power 17 = i — jS 
is small, we should repeat the test, choosing a larger sample, for reasons that will appear 
shortly. 



Acceptance region — — Rejection region (Critical region) 

Fig. 533. Illustration of Type I and II errors in testing a hypothesis 
9 = 0 O against an alternative 6 = 0, (> 9 0 , right-sided test) 

If the alternative is not a single number but is of the form (l)-(3), then j3 becomes a 
function of 6 . This function j 8(6) is called the operating characteristic (OC) of the test 
and its curve the OC curve. Clearly, in this case 77 = 1 — /3 also depends on 9. This 
function ri(0) is called the power function of the test. (Examples will follow.) 

Of course, from a test that leads to the acceptance of a certain hypothesis 0 O , it does 
not follow that this is the only possible hypothesis or the best possible hypothesis. Hence 
the terms “not reject” or “fail to reject” are perhaps better than the term “accept.” 


Test for fi of the Normal Distribution with Known a 2 

The following example explains the three kinds of hypotheses. 

Test for the Mean of the Normal Distribution with Known Variance 

Let X be a normal random variable with variance or 2 = 9. Using a sample of size n- 10 with mean .v, test the 
hypothesis fj. = /xq = 24 against the three kinds of alternatives, namely, 

(a) fx > [Xq (b) jjl < fiQ (c) /M). 


Solution . We choose the significance level a = 0.05. An estimate of the mean will be obtained from 

X = i (X, + • • • + X n ). 

If the hypothesis is true, X is normal with mean fi = 24 and variance cr 2 //t = 0.9. see Theorem I. Sec. 25.3. 
Hence we may obtain the critical value c from Table A8 in App. 5. 

Case (a), Right-Sided Test . We determine c from P{X > c)^ 2 4 = a = 0.05, that is, 

P(X £ c)„ m 24 = = I “ “ = 0.95. 

Table A 8 in App, 5 gives (c - 24)fr/o3 = J.645, and c = 25.56, which is greater than fi Qy as in the upper 
part of Fig. 532. If ,v ^ 25.56, the hypothesis is accepted. If .v > 25.56, it is rejected. The power function of 
the test is (Fig. 534) 



SEC 25.4 Testing of Hypotheses. Decisions 


1063 



Fig. 534. Power function t?(m) in Example 2, case (a) (dashed) and case (c) 


(7) 


Case (b). 


tj(M) = PVC > 25.56)^ = l -P(X£ 25.56)^ 

/ 25.56 - a \ 

- ' - $( ) « i - $(26.94 - 1.05 m) 

Left-Sided Test. The critical value c is obtained from the equation 


/ c - 24 \ 

P(X * c’V„24 = ) = « * 0.05. 

Table AS in App. 5 yields c - 24 - 1.56 = 22.44. If.? ^ 22.44. we accept the hypothesis. If.? < 22.44, we 
reject it. The power function of the test is 


( 8 ) 


*M) = nX ^ 22.44)^ 


*( 


22.44 - jx 

Vos 


) 


= 4>(23.65 - 1.05^). 


Case (c). Two-Sided Test. Since the normal distribution is symmetric, we choose cj and c 2 equidistant from 
fj. = 24. say, cj. = 24 — k and c 2 = 24 + k, and determine k from 

«24 - » Si S » + - *(^-) -■♦(- ^5 ) - ' - « - 0.«. 

Table A8 in App. 5 gives k/VoS = 1.960. hence k = 1.86. This gives the values ci = 24 — 1.86 = 22.14 
and eg = 24 + 1.86 = 25.86. If x is not smaller than c t and not greater than C 2 , we accept the hypothesis. 
Otherwise we reject it. The power function of the test is (Fig. 534) 

V Ox) = P(X < 22.I4) m 4- P(X > 25.86)^ = P(X < 22.14) M + 1 - P(X £ 25.86)„ 


(9) 


= 1 + $ 


/ 22.14 - fx 

\ V09 



25.86 - fx 

Vos 


) 


= I + 0(23.34 - 1.05m) “ $(27.26 - I.OS/u,). 


Consequently, the operating characteristic /3(m) = 1 — 77 (m) (see before) is (Fig. 535) 


/3(M) = $(27.26 - 1.05m) “ $(23.34 - 1.05m). 


If we take a larger sample, say, of size u = 100 (instead of 10), then cr 2 /n = 0.09 (instead of 0.9) and die 
critical values are c x = 23.41 and c 2 = 24,59. as can be readily verified. Then the operating characteristic of 
the test is 


0(m) = $ 


/ 24.59 - m 
l Vfr09 



23.41 - At \ 
VO09 / 


= 0(81.97 - 3.33/u) - $(78.03 - 3.33 n). 
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EXAMPLE 4 


Figure 535 shows that the corresponding OC curve is steeper than that for n = 10. This means that the increase 
of n has led to an improvement of the test, in any practical case, n is chosen as small as possible but so large that 
the test brings out deviations between /x and {jl$ that are of practical interest. For instance, if deviations of ±2 units 
are of interest, we see from Fig. 535 that n = 10 is much too small because when /jl = 24 - 2 = 22 or /x = 24 
+ 2 = 26 & is almost 50%. On the other hand, we see that n = 100 is sufficient for that purpose. ■ 



Fig. 535. Curves of the operating characteristic (OC curves) in 
Example 2, case (c), for two different sample sizes n 


Test for /jl When cr 2 is Unknown, and for a 2 

Test for the Mean of the Normal Distribution with Unknown Variance 

The tensile strength of a sample of n - 16 manila ropes (diameter 3 in.) was measured. The sample mean was 
x = 4482 kg, and the sample standard deviation was s = 1 15 kg (N. C. Wiley, 41st Annual Meeting of the 
American Society for Testing Materials). Assuming that the tensile strength is a normal random variable, test 
the hypothesis /x 0 = 4500 kg against the alternative /xj = 4400 kg. Here /jlq may be a value given by the 
manufacturer, while /jl j may result from previous experience. 

Solution . We choose the significance level a = 5%. If the hypothesis is true, it follows from Theorem 2 in 
Sec. 25.3, that the random variable 


X - Mo X - 4500 
T ~ S/Vii S/4 

has a /-distribution with n — 1 = 15 d.f. The test is left-sided. The critical value c is obtained from 
P(T < c) Mo = a = 0.05. Table A9 in App. 5 gives c — - 1.75. As an observed value of T we obtain from the 
sample / = (4482 - 4500)/(l 15/4) = -0.626. We see that t > c and accept the hypothesis. For obtaining 
numeric values of the power of the test, we would need tables called noncentral Student /-tables; we shall not 
discuss this question here. H 

Test for the Variance of the Normal Distribution 

Using a sample of size n — 15 and sample variance s 2 = 13 from a normal population, test the hypothesis 
< r 2 = cr 0 2 = 10 against the alternative cr z = a 2 = 20, 

Solution . We choose the significance level a = 5%. If the hypothesis is true, then 

S 2 S 2 

Y=(n - 1) — =• = 14 — = 1.4S 2 
(To JO 

has a chi-square distribution with n — J = 14 d.f. by Theorem 3, Sec. 25.3. From 

P(Y > c) = a = 0.05, that is, P(Y^c) = 0.95, 
and Table A 10 in App. 5 with 14 degrees of freedom we obtain c = 23.68. This is the critical value of Y. Hence 
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to S 2 = cr 0 2 Y/(n — 1) = 0 1\AY there corresponds the critical value c* = 0.714*23.68 = 16.91. Since 
s 2 < c* t we accept the hypothesis. 

Tf the alternative is true, the random variable Y x = 14S 2 /o , 1 2 = 0.7S 2 has a chi-square distribution with 14 
d.f. Hence our test has the power 

V - P(S 2 > c*)„^ 20 = P(Y 1 > 0.7 = I - P(Y X § 1 l.84) ffl _ 20 . 

From a more extensive table of the chi-square distribution (e.g. in Ref. [G31 or [08]) or from your CAS. you 
see that 17 62%. Hence the Type II risk is very large, namely, 38%. To make this risk smaller, we would 

have to increase the sample size. H 


Comparison of Means and Variances 

Comparison of the Means of Two Normal Distributions 

Using a sample .v*, • • • . x n ^ from a normal distribution with unknown mean p, x and a sample yj, • • • , y n2 from 
another normal distribution with unknown mean p. y , we want to test the hypothesis that the means are equal, 
fJ'x ~ Afy Against an alternative, say, p. x > fiy. The variances need not be known but are assumed to be equal. 3 
Two cases of comparing means are of practical importance: 

Case A. The samples have the same size. Furthermore , each value of the first sample corresponds to precisely 
one value of the other, because corresponding values result from the same person or thing (paired comparison) — 
for example, two measurements of the same thing by two different methods or two measurements from the two 
eyes of the same person. More generally, they may result from pairs of similar individuals or things, for example, 
identical twins, pairs of used front tires from the same car, etc. Then we should form the differences of 
corresponding values and test the hypothesis that the population corresponding to the differences has mean 0, 
using the method in Example 3. If we have a choice, this method is better than the following. 

Case B. The two samples are independent and not necessarily of the same size. Then we may proceed as 
follows. Suppose that the alternative is /jl x > pL y . We choose a significance level or. Then we compute the sample 
means .v and y as well as (/? x - l)s r 2 and (n 2 “ l)^ 2 , where s 2 and s 2 are the sample variances. Using Table 
A9 in App. 5 with /ij + n 2 — 2 degrees of freedom, we now determine c from 


(10) 


F(T ^ c) = 1 - a. 


We finally compute 
( 11 ) 


'0 - 


;ii + n 2 


- 


x - y 

\)s x 2 + (n 2 - \)s y 2 


It can be shown that tliis is an observed value of a random variable dial has a /-distribution with *1* n 2 — 2 
degrees of freedom, provided the hypothesis is true. If t 0 ^ c, die hypodiesis is accepted. If t 0 > c, it is rejected. 
If the alternative is fji x t p, y , then (10) must be replaced by 


(10*) 


P{T ^ c x ) = 0.5a, P(T ^ c 2 ) = 1 - 0.5a. 


Note that for samples of equal size /?! = n 2 = n , formula (II) reduces to 


( 12 ) 


! 0 = Vn 


Vs x 2 + s 2 


3 This assumption of equality of variances can be tested, as shown in the next example. If the test shows that 
they differ significantly, choose two samples of the same size n x = n 2 = n (not too small. > 30, say), use the 
test in Example 2 together with the fact that (12) is an observed value of an approximately standardized normal 
random variable. 
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To illustrate the computations, let us consider the two samples (.Vj. • • • . .r n ) and (.v a . • • • , y n ) given by 

105 108 86 103 103 107 124 105 

and 

89 92 84 97 103 107 111 97 

showing the relative output of tin plate workers under two different working conditions [J. J. B. Worth, Journal 
of Industrial Engineering 9, 249-253). Assuming that the corresponding populations are normal and have the 
same variance, let us test the hypothesis pu x = p y against the alternative fi v =£ pc y . (Equality of variances will 
be tested in the next example.) 

Solution . We find 

x = 105.125, y = 97.500. s 2 = 106.125, s 2 = 84.000. 

We choose the significance level a = 5%. From (HP) with 0.5a = 2.5%. 1 — 0.5a = 97.5% and Table A9 in 
App. 5 with 14 degrees of freedom we obtain c x = —2.14 and c 2 = 2.14. Formula (12) with n = 8 gives the 
value 

/ 0 = V8 • 7.625/ V 190. 125 = 1.56. 

Since ^ / 0 = c 2 ' we accept the hypothesis p. x = p y that under both conditions the mean output is the same. 

Case A applies to the example because the two first sample values correspond to a certain type of work, the 
next two were obtained in another kind of work, etc. So we may use the differences 

16 16 2 6 0 0 13 8 


of corresponding sample values and the method in Example 3 to test the hypothesis pu = 0, where p. is die mean 
of the population corresponding to the differences. As a logical alternative we take tx # 0. The sample mean is 
d = 7.625, and the sample variance is .v 2 = 45.696. Hence 

i = Vs (7.625 - 0)/V45.696 = 3.19. 

From P(T ^ Ci ) = 2.5%, P{T ^ c 2 ) — 97.5% and Table A9 in App. 5 with n - 1=7 degrees of freedom we 
obtain Ci = —2.36, c 2 = 2.36 and reject the hypothesis because t = 3.19 does not lie between c x and t 2 . Hence 
our present test, in which we used more information (but the same samples), shows that the difference in output 
is significant. H 

Comparison of the Variance of Two Normal Distributions 

Using the two samples in the last example, test the hypothesis cr x 2 = a 2 \ assume that the corresponding 
populations are normal and the nature of the experiment suggests the alternative or 2 > cr y 2 . 

Solution . Wc find s 2 = 106.125, s 2 = 84.000. We choose the significance level a = 5%. Using 
P(V ^ c) = 1 — a - 95 % and Table All in App. 5. with (n x — 1. /i 2 — 1) = (7. 7) degrees of freedom, we 
determine c = 3.79. We finally compute v 0 = s 2 ls 2 - 1,26. Since v 0 ^ c, we accept the hypothesis. If 
v 0 > t\ we would reject it. 

This test is justified by the fact that Vq is an observed value of a random variable that has a so-called 
F-distribution with (n x - I. « 2 — I) degrees of freedom, provided the hypothesis is true. (Proof in Ref. |G3] 
listed in App. I.) The F-distribution with (m, n) degrees of freedom was introduced by R. A. Fisher 4 and has 
the distribution function F(z) - 0 if c < 0 and 

(13) F(z) = K mn f t <m ~ 2)/2 (mt + ;,)“ (m+n>/2 dt (z ^ 0). 

J 0 

where K mn = m m,2 n n,2 T(\in + g/i)/r(|//i)r(|«). (For f sec App. A3.I.) I 


4 After the pioneering work of the English statistician and biologist, KARL PEARSON (1857-1936), the 
founder of the English school of statistics, and WILLIAM SEALY GOSSET (1876-1937), who discovered the 
/-distribution (and published under the name ‘Studenf), the English statistician Sir RONALD AYLMER FISHER 
(1890-1962). professor of eugenics in London (1933-1943) and professor of genetics in Cambridge, England 
(1943-1957) and Adelaide. Australia (1957-1962), had great influence on the further development of modern 
statistics. 
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This long section contained the basic ideas and concepts of testing, along with typical 
applications and you may perhaps want to review it quickly before going on, because the 
next sections concern an adaption of these ideas to tasks of great practical importance and 
resulting tests in connection with quality control, acceptance (or rejection) of goods 
produced, and so on. 


PROBLEM SET 25.4 


1. Test (A = 0 against fx > 0. assuming normality and 
using the sample 1, — I, 1, 3, —8. 6, 0 (deviations of 
the azimuth [multiples of 0.01 radian] in some 
revolution of a satellite). Choose a = 5%. 

2. In one of his classical experiments Bui'fon obtained 
2048 heads in tossing a coin 4040 times. Was the coin 
fair? 

3. Do the same test as in Prob. 2, using a result by 
K. Pearson, who obtained 6 019 heads in 12 000 trials. 

4. Assuming normality and known variance cr 2 = 4, test 
the hypothesis jll = 30.0 against the alternative (a) 
l± = 28.5, (b) At = 30.7. using a sample of size 10 with 
mean „v = 28.5 and choosing a = 5%. 

5. How does the result in Prob. 4(a) change if we use a 
smaller sample, say. of size 4, the other data (.v = 28.5, 
a = 5%, etc.) remaining as before? 

6. Detemine the power of the test in Prob. 4(a). 

7. What is the rejection region in Prob. 4 in the case of a 
two-sided test with a = 5 % ? 

8. Using the sample 0.80, 0.81. 0.81, 0.82. 0.81, 0.82, 
0.80, 0.82, 0.81, 0.81 (length of nails in inches), test 
the hypothesis ji — 0.80 in. (the length indicated on 
the box) against the alternative fx r= 0.80 in. (Assume 
normality, choose a = 5%.) 

9. A firm sells oil in cans containing 1000 g oil per can 
and is interested to know whether the mean weight 
differs significantly from lOOOg at the 5% level, in 
which case the filling machine has to be adjusted. Set 
up a hypothesis and an alternative and perform the test, 
assuming normality and using a sample of 20 fillings 
with mean 996 g and standard deviation 5 g. 

10. If a sample of 50 tires of a certain kind has a mean life 
of 32 000 mi and a standard deviation of 4000 mi, can 
the manufacturer claim that the true mean life of such 
tires is greater than 30 000 mi? Set up and test a 
corresponding hypothesis at a 5% level, assuming 
normality. 

11. If simultaneous measurements of electric voltage by 
two different types of voltmeter yield the differences 
(in volts) 0.8, 0.2. -0.3. 0.1. 0.0. 0.5, 0.7, 0.2, can we 
assert at the 5% level that there is no significant 
difference in the calibration of the two types of 
instruments? (Assume normality.) 


12. If a standard medication cures about 70% of patients 
with a certain disease and a new medication cured 148 
of the first 200 patients on whom it was tried, can we 
conclude that the new medication is better? (Choose 
a = 5%.) 

13. Suppose that in the past the standard deviation of 
weights of certain 25.0-oz packages filled by a machine 
was 0.4 oz. Test the hypothesis H 0 : or = 0.4 against 
the alternative H x : cr > 0.4 (an undesirable increase), 
using a sample of 10 packages with standard deviation 
0.5 oz and assuming normality. (Choose a = 5%.) 

14. Suppose that in operating battery-powered electrical 
equipment, it is less expensive to replace all batteries 
at fixed intervals than to replace each battery 
individually when it breaks down, provided the 
standard deviation of the lifetime is less than a certain 
limit, say, less than 5 hours. Set up and apply a suitable 
test, using a sample of 28 values of lifetimes with 
standard deviation s = 3.5 hours and assuming 
normality: choose a = 5%. 

15. Brand A gasoline was used in 9 automobiles of the 
same model under identical conditions. The 
corresponding sample of 9 values (miles per gallon) 
had mean 20.2 and standard deviation 0.5. Under the 
same conditions, high-power brand B gasoline gave a 
sample of 10 values with mean 21.8 and standard 
deviation 0.6. Is the mileage of B significantly better 
than that of A? (Test at the 5% level; assume 
normality.) 

16. The two samples 70, 80. 30. 70, 60. 80 and 140. 120. 
130. 120, 120, 130, 120 are values of the differences 
of temperatures (°C) of iron at two stages of casting, 
taken from two different crucibles. Is the variance of 
the first population larger than that of the second? 
(Assume normality. Choose a — 5 %.) 

17. Using samples of sizes 10 and 16 with variances 
s 2 = 50 and s v 2 = 30 and assuming normality of the 
corresponding populations, test the hypothesis 
H 0 : cr 2 = cr 2 against the alternative or 2 > a y 2 . 
Choose a = 5%. 

18. Assuming normality and equal variance and using 
independent samples with tt x = 9, .v = 12, .v r = 2. 
n 2 = 9, y = 15, s u = 2, test H 0 : jx x = jx y against 
fx x & fx y ; choose a = 5%. 
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19. Show that for a normal distribution the two types of 
errors in a test of a hypothesis Ii 0 : jx = jxq against an 
alternative H x : /x = jx x can be made as small as one 
pleases (not zero) by taking the sample sufficiently large. 

20. CAS EXPERIMENT. Tests of Means and 
Variances, (a) Obtain 100 samples of size 10 each from 


the normal distribution with mean 100 and variance 25. 
For each sample test the hypothesis jx 0 — 100 against 
the alternative /x x > 100 at the level of a = 10%. Record 
the number of rejections of the hypothesis. Do the whole 
experiment once more and compare. 

(b) Set up a similar experiment for the variance of a 
normal distribution and perform it 100 times. 


25 .! Quality Control 

The ideas on testing can be adapted and extended in various ways to serve basic practical 
needs in engineering and other fields. We show this in the remaining sections for some 
of the most important tasks solvable by statistical methods. As a first such area of 
problems, we discuss industrial quality control, a highly successful method used in 
various industries. 

No production process is so perfect that all the products are completely alike. There 
is always a small variation that is caused by a great number of small, uncontrollable 
factors and must therefore be regarded as a chance variation. It is important to make 
sure that the products have required values (for example, length, strength, or whatever 
property may be essential in a particular case). For this purpose one makes a test of the 
hypothesis that the products have the required property, say, fi = /x 0 , where /x 0 is a 
required value. If this is done after an entire lot has been produced (for example, a lot 
of 100 000 screws), the test will tell us how good or how bad the products are, but it 
it obviously too late to alter undesirable results. It is much better to test during the 
production run. This is done at regular intervals of time (for example, every hour or 
half-hour) and is called quality control. Each time a sample of the same size is taken, 
in practice 3 to 10 times. If the hypothesis is rejected, we stop the production and look 
for the cause of the trouble. 

If we stop the production process even though it is progressing properly, we make a 
Type I eiror. If we do not stop the process even though something is not in order, we 
make a Type II error (see Sec. 25.4). The result of each test is marked in graphical form 
on what is called a control chart. This was proposed by W. A. Shewhart in 1924 and 
makes quality control particularly effective. 


Control Chart for the Mean 

An illustration and example of a control chart is given in the upper part of Fig. 536. This 
control chart for the mean shows the lower control limit LCL, the center control line 
CL, and the upper control limit UCL. The two control limits correspond to the critical 
values Cj and c 2 in case (c) of Example 2 in Sec. 25.4. As soon as a sample mean falls 
outside the range between the control limits, we reject the hypothesis and assert that the 
production process is “out of control”; that is, we assert that there has been a shift in 
process level. Action is called for whenever a point exceeds the limits. 

If we choose control limits that are too loose, we shall not detect process shifts. On the 
other hand, if we choose control limits that are too tight, we shall be unable to run the 
process because of frequent searches for nonexistent trouble. The usual significance level 
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is a = 1%. From Theorem 1 in Sec. 25.3 and Table A8 in App. 5 we see that in the case 
of the normal distribution the corresponding control limits for the mean are 

(1) LCL = fx o - 2.58 f UCL = Mo + 2.58 . 

V/7 Vrt 

Here a is assumed to be known. If a is unknown, we may compute the standard deviations 
of the first 20 or 30 samples and take their arithmetic mean as an approximation of cr. 
The broken line connecting the means in Fig. 536 is merely to display the results. 

Additional, more subtle controls are often used in industry. For instance, one observes 
the motions of the sample means above and below the centerline, which should happen 
frequently. Accordingly, long runs (conventionally of length 7 or more) of means all above 
(or all below) the centerline could indicate trouble. 



Sample no. 5 10 


Fig. 536. Control charts for the mean (upper part of figure) and 
the standard deviation in the case of the samples on p. 1070 
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Table 25.5 Twelve Samples of Five Values Each 
(Diameter of Small Cylinders, Measured in Millimeters) 


Sample 

Number 

Sample Values 

X 

s 

R 

J 

4.06 

4.08 

4.08 

4.08 

4.10 

4.080 

0.014 

0.04 

2 

4.10 

4.10 

4.12 

4.12 

4.12 

4.112 

0.011 

0.02 

3 

4.06 

4.06 

4.08 

4.10 

4.12 

4.084 

0.026 

0.06 

4 

4.06 

4.08 

4.08 

4.10 

4.12 

4.088 

0.023 

0.06 

5 

4.08 

4.10 

4.12 

4.12 

4.12 

4.108 

0.018 

0.04 

6 

4.08 

4.10 

4.10 

4.10 

4.12 

4.100 

0.014 

0.04 

7 

4.06 

4.08 

4.08 

4.10 

4.12 

4.088 

0.023 

0.06 

8 

4.08 

4.08 

4.10 

4.10 

4.12 

4.096 

0.017 

0.04 

9 

4.06 

4.08 

4.10 

4.12 

4.14 

4.100 

0.032 

0.08 

10 

4.06 

4.08 

4.10 

4.12 

4.16 

4.104 

0.038 

0.10 

11 

4.12 

4.14 

4.14 

4.14 

4.16 

4.140 

0.014 

0.04 

12 

4.14 

4.14 

4.16 

4.16 

4.16 

4.152 

0.011 

0.02 


Control Chart for the Variance 

In addition to the mean, one often controls the variance, the standard deviation, or the 
range. To set up a control chart for the variance in the case of a normal distribution, we 
may employ the method in Example 4 of Sec. 25.4 for determining control limits. It is 
customary to use only one control limit, namely, an upper control limit. Now from Example 
4 of Sec. 25.4 we have S 2 = a 0 2 YI(n — 1), where because of our normality assumption 
the random variable Y has a chi-square distribution with n — 1 degrees of freedom. Hence 
the desired control limit is 


( 2 ) 


UCL = 


cr 2 c 
n — 1 


where c is obtained from the equation 

P[Y > c) = a, that is, P(Y ^ c) = \ - a 

and the table of the chi-square distribution (Table A 10 in App. 5) with n — 1 degrees of 
freedom (or from your CAS); here a (5% or 1%. say) is the probability that in a properly 
running process an observed value s 2 of S 2 is greater than the upper control limit. 

If we wanted a control chart for the variance with both an upper control limit UCL and 
a lower control limit LCL, these limits would be 

a 2 c\ <r 2 Co 

(3) LCL = V and UCL = — . 

n - 1 7? — I 


where and c 2 are obtained from Table A 10 with n - 1 d.f. and the equations 




P(Y^c 2 ) = 


a 
2 * 


(4) 


and 
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Control Chart for the Standard Deviation 

To set up a control chart for the standard deviation, we need an upper control limit 

crVc 

(5) UCL = - 

vn — I 

obtained from (2). For example, in Table 25.5 we have n = 5. Assuming that the 
corresponding population is normal with standard deviation a = 0.02 and choosing 
a = 1 %, we obtain from the equation 


P(Y^c)= 1 — « = 99% 


and Table A10 in App. 5 with 4 degrees of freedom the critical value c = 13.28 and from 
(5) the corresponding value 


UCL = 


0.02 Vl 3.28 

vs 


= 0.0365, 


which is shown in the lower part of Fig. 536. 

A control chart for the standard deviation with both an upper and a lower control limit 
is obtained from (3). 


Control Chart for the Range 

Instead of the variance or standard deviation, one often controls the range R (= largest 
sample value minus smallest sample value), it can be shown that in the case of the normal 
distribution, the standard deviation or is proportional to the expectation of the random 
variable for which R is an observed value, say, or = A w £(/?*), where the factor of 
proportionality A n depends on the sample size n and has the values 


n 

2 

3 

4 

5 

6 

7 

8 

9 

10 

\ n = crlE(R*) 

0.89 

0.59 

0.49 

0.43 

0.40 

0.37 

0.35 

0.34 

0.32 

n 

12 

14 

16 

18 

20 

30 

40 

50 

A n = cr/E(R*) 

0.31 

0.29 

0.28 

0.28 

0.27 

0.25 

0.23 

0.22 


Since R depends on two sample values only, it gives less information about a sample 
than s does. Clearly, the larger the sample size n is, the more information we lose in using 
R instead of s. A practical rule is to use s when n is larger than 10. 


ER&ffLEH SIT 25 . 5 


1. Suppose a machine for filling cans with lubricating oil 
is set so that it will generate fillings which form a 
normal population with mean I gal and standard 
deviation 0.03 gal. Set up a control chart of the type 
shown in Fig. 536 for controlling the mean (that is, find 
LCL and UCL), assuming that the sample size is 6. 

2. (Three-sigma control chart) Show that in Prob. 1. the 


requirement of the significance level a - 0.3% leads 
to LCL = /jl, — 3cr/^/n and UCL = }i + 3o/V/7, and 
find the corresponding numeric values. 

3. What sample size should we choose in Prob. J if we 
want LCL and UCL somewhat closer together, say, 
UCL — LCL = 0.05. without changing the significance 
level? 
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4. How does the meaning of the control limits ( 1 ) change 
if we apply a control chart with these limits in the case 
of a population that is not normal? 

5. How should we change the sample size in controlling the 
mean of a normal population if we want the difference 

UCL - LCL 

to decrease to half its original value? 

6. What LCL and UCL should we use instead of (1) if 
instead of .v we use the sum *| + • • • 4- .v„ of the 
sample values? Determine these limits in the case of 
Fig. 536. 

7. Ten samples of size 2 were taken from a production 
lot of bolts. The values (length in mm) are as shown. 
Assuming that the population is normal with mean 27.5 
and variance 0.024 and using (1). set up a control chart 
for the mean and graph the sample means on the chart. 

Sample , 2 3 4 5 6 7 8 9 10 


, . 27.4 27.4 27.5 27.3 27.9 27.6 27.6 27.8 27.5 27.3 

Length 

27.6 27.4 27.7 27.4 27.5 27.5 27.4 27.3 27.4 27.7 

8. Graph the means of the following 10 samples 
(thickness of washers, coded values) on a control chart 
for means, assuming that the population is normal with 
mean 5 and standard deviation 1.55. 


Time 8:00 8:30 9:00 9:30 10:00 10:30 1 1:00 1 1:30 12:00 12:30 



3 

3 

5 

7 

7 

4 

5 

6 

5 

5 

Sample 

4 

6 

2 

5 

3 

4 

6 

4 

5 

2 

Values 

8 

6 

5 

4 

6 

3 

4 

6 

6 

5 


4 

8 

6 

4 

5 

6 

6 

4 

4 

3 


9. Graph the ranges of the samples in Prob. 8 on a control 
chart for ranges, 

10. What effect on UCL - LCL does it have if we double 
the sample size? If we switch from a = 1% to a = 5%? 

11. Since the presence of a point outside control limits for 
the mean indicates trouble (“the process is out of 
control”)* how often would we be making the mistake 
of looking for nonexistent trouble if we used (a) 1 -sigma 
limits, (b) 2-sigma limits? (Assume normality.) 

12. Graph X n = <r/E(R*) as a function of n . Why is A n a 
monotone decreasing function of / 2 ? 

13. (Number of defectives) Find formulas for die UCL, 
CL, and LCL (corresponding to 3o , -limits) in the case 
of a control chart for the number of defectives, 
assuming that in a state of statistical control the fraction 
of defectives is p. 


14. How would progressive tool wear in an automatic lathe 
operation be indicated by a control chart of the mean? 
Answer the same question for a sudden change in the 
position of the tool in that operation. 

15. (Number of defects per unit) A so-called c-chart or 
defects-per-unit chart is used for the control of the 
number X of defects per unit (for instance, the number 
of defects per 10 meters of paper, the number of 
missing rivets in an airplane wing, etc.) (a) Set up 
formulas for CL and LCL, UCL corresponding to 

p. ± 3cr, 

assuming that X has a Poisson distribution, (b) Compute 
CL, LCL, and UCL in a control process of the number 
of imperfections in sheet glass; assume that this number 
is 2.5 per sheet on the average when the process is 
under control. 

16. (Attribute control charts). Twenty samples of size 
100 were taken from a production of containers. The 
numbers of defectives (leaking containers) in those 
samples (in the order observed) were 

376145497056 13 4 
9 0 2 1 12 8. 

From previous experience it was known that the 
average fraction defective is p = 5% provided that 
die process of production is running properly. Using 
the binomial distribution, set up a fraction defective chan 
(also called a p-chart), that is, choose the LCL = 0 
and determine the UCL for the fraction defective (in 
percent) by the use of 3-sigma limits, where cr 2 is the 
variance of the random variable 

X = Fraction defective in a sample of size 100. 

Is the process under control? 

17. CAS PROJECT. Control Charts, (a) Obtain 100 
samples of 4 values each from the normal distribution 
with mean 8.0 and variance 0.16 and their means, 
variances, and ranges. 

(b) Use these samples for making up a control chart 
for the mean. 

(c) Use them on a control chart for the standard 
deviation. 

(d) Make up a control chart for the range. 

(e) Describe quantitative properties of the samples 
that you can see from those charts (e.g., whether the 
corresponding process is under control, whether the 
quantities observed vary randomly, etc.). 
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25.6 Acceptance Sampling 

Acceptance sampling is usually done when products leave the factory (or in some cases 
even within the factory). The standard situation in acceptance sampling is that a producer 
supplies to a consumer (a buyer or wholesaler) a lot of N items (a carton of screws, for 
instance). The decision to accept or reject the lot is made by determining the number x 
of defectives (= defective items) in a sample of size n from the lot. The lot is accepted 
if a* = c, where c is called the acceptance number, giving the allowable number of 
defectives. If a* > c, the consumer rejects the lot. Clearly, producer and consumer must 
agree on a certain sampling plan giving n and c. 

From the hypergeometric distribution we see that the event A: “Accept the lot” has 
probability (see Sec. 24.7) 


(1) P(A) = P(X S c) - 2 M ( N / M 

r==0 \x / \ n — x J / \nj 

where M is the number of defectives in a lot of N items. In terms of the fraction defective 
0 = MIN we can write (1) as 


( 2 ) 


P(A; 6) = 2 

rc=0 



P(A; 0) can assume n + 1 values corresponding to 0 = 0, UN, 2 IN, • • • , NIN; here, n 
and c are fixed. A monotone smooth curve through these points is called the operating 
characteristic curve (OC curve) of the sampling plan considered. 


EXAMPLE 1 Sampling Plan 


Suppose that certain tool bits are packaged 20 to a box, and the following sampling plan is used. A sample of 
two tool bits is drawn, and the corresponding box is accepted if and only if both bits in the sample are good. 
In this case. N - 20. n = 2. c = 0, and (2) takes the form (a factor 2 drops out) 


P(A; 0) = 



(20 - 203X19 - 20$) 
380 


The values of P{A> $) for 0 = 0, 1/20, 2/20, • • • . 20/20 and the resulting OC curve are shown in Fig. 537 on 
p. 1074. (Verify!) ■ 


In most practical cases 0 will be small (less than 10%). Then if we take small samples 
compared to N, we can approximate (2) by the Poisson distribution (Sec. 24.7); thus 


/>(/!;<?) 4 
,=o * ! 


(3) 


(/i = nd). 
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EXAMPLE 2 




Fig. 537. OC curve of the sampling plan with n = 2 Fig. 538. OC curve in Example 2 

and c = 0 for lots of size N = 20 

Sampling Plan. Poisson Distribution 

Suppose that for large lots the following sampling plan is used. A sample of size n = 20 is taken. If it contains 
not more than one defective, the lot is accepted. If the sample contains two or more defectives, the lot is rejected. 
In this plan, we obtain from (3) 


P(A; 0) ~ e~ 20u (\ + 20 0). 

The corresponding OC curve is shown in Fig. 538. M 

Errors in Acceptance Sampling 

We show how acceptance sampling fits into general test theory (Sec. 25.4) and what this 
means from a practical point of view. The producer wants the probability a of rejecting 
an acceptable lot (a lot for which 6 does not exceed a certain number 0 Q on which the 
two parties agree) to be small. 0 O is called the acceptable quality level (AQL). Similarly, 


PiA\Q) 



Good [ Indifference | Poor 

material, zone j material 


Fig. 539. OC curve, producer's and consumer's risks 
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the consumer (the buyer) wants the probability /? of accepting an unacceptable lot (a lot 
for which 0 is greater than or equal to some Si) to be small. 0 1 is called the lot tolerance 
percent defective (LTPD) or the rejectable quality level (RQL). a is called producer’s 
risk. It corresponds to a Type I error in Sec. 25.4. /3 is called consumer’s risk and 
corresponds to a Type II error. Figure 539 shows an example. We see that the points 
(0 O , 1 — a) and (0 X , /3) lie on the OC curve. It can be shown that for large lots we can 
choose 0 O , 0 1 (> 0 O ), a , /3 and then determine n and c such that the OC curve runs very 
close to those prescribed points. Table 25.6 shows the analogy between acceptance 
sampling and hypothesis testing in Sec. 25.4. 


Table 25.6 Acceptance Sampling and Hypothesis Testing 


Acceptance Sampling 

Hypothesis Testing 

Acceptable quality level (AQL) 0 = 0 O 
Lot tolerance percent defectives (LTPD) 
0=0 X 

Allowable number of defectives c 
Producer's risk a of rejecting a lot 
with 0 ^ 0 () 

Consumer’s risk (3 of accepting a lot 
with 0=0 t 

Hypothesis 0 = 0 O 
Alternative 0 = 0 X 
Critical value c 

Probability a of making a Type I error 
(significance level) 

Probability (3 of making a Type II error 


Rectification 

Rectification of a rejected lot means that the lot is inspected item by item and all defectives 
are removed and replaced by nondefective items. (This may be too expensive if the lot is 
cheap; in this case the lot may be sold at a cut-rate price or scrapped.) If a production 
turns out 1000% defectives, then in K lots of size N each, KN6 of the KN items are 
defectives. Now KP(A: 0) of these lots are accepted. These contain KPN6 defectives, 
whereas the rejected and rectified lots contain no defectives, because of the rectification. 
Hence after the rectification the fraction defective in all K lots equals KPN6/KN. This is 
called the average outgoing quality (AOQ); thus 

( 4 ) AOQ(0) = 0P{A; 0). 



Fig. 540. OC curve and AOQ curve for the sampling plan in Fig. 537 
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Figure 540 on p. 1075 shows an example. Since AOQ(O) = 0 and P(A; 1) = 0, the AOQ 
curve has a maximum at some 0 = 0*, giving the average outgoing quality limit (AOQL). 
This is the worst average quality that may be expected to be accepted under rectification. 


HT ~ E 


1. Lots of knives are inspected by a sampling plan that 
uses a sample of size 20 and the acceptance number 
c = 1. What are probabilities of accepting a lot with 
1%, 2%, 10% defectives (dull blades)? Use Table A6 
in App. 5. Graph the OC curve. 

2. What happens in Prob. 1 if the sample size is increased 
to 50? First guess. Then calculate. Graph the OC curve 
and compare. 

3. How will the probabilities in Prob. I with n = 20 change 
(up or down) if we decrease c to zero? First guess. 

4. What are die producer’s and consumer’s risks in 
Prob. 1 if the AQL is 1.5% and the RQL is 7.5%? 

5. Large lots of batteries are inspected according to the 
following plan, n = 30 batteries are randomly drawn 
from a lot and tested. If this sample contains at most 
c = 1 defective battery, the lot is accepted. Otherwise 
it is rejected. Graph the OC curve of the plan, using 
the Poisson approximation. 

6. Graph the AOQ curve in Prob. 5. Determine the 
AOQL, assuming that rectification is applied. 

7. Do the work required in Prob. 5 if n = 50 and c — 0. 

8. Find the binomial approximation of the hypergeometric 
distribution in Example 1 and compare the approximate 
and the accurate values. 

9. In Example 1, what are the producer’s and consumer’s 
risks if the AQL is 0.1 and the RQL is 0.6? 

10. Calculate P(A; 6) in Example 1 if the sample size is 
increased from n = 2 to n = 3, the other data remaining 
as before. Compute P(A; 0.10) and P(A; 0.20) and 


compare with Example 1 . 

11. Samples of 5 screws are drawn from a lot with fraction 
defective 0. The lot is accepted if the sample contains 
(a) no defective screws, (b) at most 1 defective screw. 
Using the binomial distribution, find, graph, and 
compare the OC curves. 

12. Find the risks in the single sampling plan with n = 5 
and c = 0, assuming that the AQL is 0 O = 1 % and the 
RQL is 9 X = 15%. 

13. Why is it impossible for an OC curve to have a vertical 
portion separating good from poor quality? 

14. If in a single sampling plan for large lots of spark plugs, 
the sample size is 100 and we want the AQL to be 5% 
and the producer’s risk 2%, what acceptance number 
c should we choose? (Use the normal approximation.) 

15. What is the consumer’s risk in Prob. 14 if we want the 
RQL to be 12%? 

16. Graph and compare sampling plans with c = 1 and 
increasing values of n, say, n = 2, 3, 4. (Use the 
binomial distribution.) 

17. Samples of 3 fuses are drawn from lots and a lot is 
accepted if in the corresponding sample we find no 
more than 1 defective fuse. Criticize this sampling plan. 
In particular, find the probability of accepting a lot that 
is 50% defective. (Use the binomial distribution.) 

18. Graph the OC curve and the AOQ curve for the single 
sampling plan for large lots with n = 5 and c = 0, and 
find the AOQL. 


25.7 Goodness of Fit. ^ 2 -Test 

To test for goodness of fit means that we wish to test that a certain function F( x) is the 
distribution function of a distribution from which we have a sample .v lt • • • , x n . Then we 
test whether the sample distribution function F{x) defined by 

F(x) = Sum of the relative frequencies of all sample values Xj not exceeding x 

fits F(x) “sufficiently well.” If this is so, we shall accept the hypothesis that F(x) is the 
distribution function of the population; if not, we shall reject the hypothesis. 
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This test is of considerable practical importance, and it differs in character from the 
tests for parameters (/x, cr 2 , etc.) considered so far. 

To test in that fashion, we have to know how much F(x) can differ from F(x) if the 
hypothesis is true. Hence we must first introduce a quantity that measures the deviation 
of F(x) from F(*), and we must know the probability distribution of this quantity under 
the assumption that the hypothesis is true. Then we proceed as follows. We determine a 
number c such that if the hypothesis is true, a deviation greater than c has a small 
preassigned probability. If t nevertheless, a deviation greater than c occurs, we have reason 
to doubt that the hypothesis is true and we reject it. On the other hand, if the deviation 
does not exceed c, so that F(x) approximates F(x) sufficiently well, we accept the 
hypothesis. Of course, if we accept the hypothesis, this means that we have insufficient 
evidence to reject it, and this does not exclude the possibility that there are other functions 
that would not be rejected in the test. In this respect the situation is quite similar to that 
in Sec. 25.4. 

Table 25.7 shows a test of that type, which was introduced by R. A. Fisher. This test 
is justified by the fact that if the hypothesis is true, then Xo 2 IS an observed value of a 
random variable whose distribution function approaches that of the chi-square distribution 
with K — 1 degrees of freedom (or K — r — 1 degrees of freedom if r parameters are 
estimated) as n approaches infinity. The requirement that at least five sample values lie 
in each interval in Table 25.7 results from the fact that for finite n that random variable 
has only approximately a chi-square distribution. A proof can be found in Ref. [G3] listed 
in App. 1. If the sample is so small that the requirement cannot be satisfied, one may 
continue with the test, but then use the result with caution. 


Table 25.7 Chi-square Test for the Hypothesis That F(x) is the Distribution Function 
of a Population from Which a Sample x lf • • • , x n is Taken 


Step L Subdivide the A-axis into K intervals l ly / 2 , • ■ ■ , l K such that each interval contains 
at least 5 values of the given sample at, * • • , x n . Determine the number bj of sample 
values in the interval /,, where j = 1, * • • , K. If a sample value lies at a common 
boundary point of two intervals, add 0.5 to each of the two corresponding bj. 

Step 2. Using F(x\ compute the probability pj that the random variable X under 
consideration assumes any value in the interval Ij. where j = 1, • • • , K. Compute 


ej = npj. 


(This is the number of sample values theoretically expected in if the hypothesis 
is true.) 

Step 3. Compute the deviation 


( 1 ) 


K 


Xo 


= 2 


(frj - e j) 2 


j=l 


Step 4. Choose a significance level (5%, 1%, or the like). 
Step 5. Determine the solution c of the equation 


P(X 2 = c) = l — a 

from the table of the chi-sqare distribution with K — 1 degrees of freedom (Table 
A 10 in App. 5). If / parameters of F(x) are unknown and their maximum likelihood 
estimates (Sec. 25.2) are used, then use K — r - 1 degrees of freedom (instead 
of K - I). If Xo 2 = C, accept the hypothesis. If x 0 2 > c, reject the hypothesis. 
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Table 25.8 Sample of 100 Values of the Splitting Tensile Strength (lb/in. 2 ) 
of Concrete Cylinders 


320 

380 

340 

410 

380 

340 

360 

350 

320 

370 

350 

340 

350 

360 

370 

350 

380 

370 

300 

420 

370 

390 

390 

440 

330 

390 

330 

360 

400 

370 

320 

350 

360 

340 

340 

350 

350 

390 

380 

340 

400 

360 

350 

390 

400 

350 

360 

340 

370 

420 

420 

400 

350 

370 

330 

320 

390 

380 

400 

370 

390 

330 

360 

380 

350 

330 

360 

300 

360 

360 

360 

390 

350 

370 

370 

350 

390 

370 

370 

340 

370 

400 

360 

350 

380 

380 

360 

340 

330 

370 

340 

360 

390 

400 

370 

410 

360 

400 

340 

360 


D. L. IVEY, Splitting tensile tests on structural lightweight aggregate concrete. Texas Transportation 
Institute, College Station, Texas. 


EXAMPLE 1 Test of Normality 

Test whether the population from which the sample in Table 25.8 wits taken is normal. 

Solution, Table 25.8 shows the values (column by column) in the order obtained in the experiment. Table 
25.9 gives the frequency distribution and Fig. 541 the histogram. It is hard to guess the outcome of the 
test — does the histogram resemble a normal density curve sufficiently well or not? 

The maximum likelihood estimates for /x and cr 2 are jx = .v = 364.7 and tr 2 = 712.9. The computation in 
Table 25.10 yields * 0 2 = 2.942. It is very interesting that the interval 375 • • • 385 contributes over 50% of 
Xo 2 - Prom the histogram we see that the corresponding frequency looks much too small. The second largest 
contribution comes from 395 • ■ • 405. and the histogram shows that the frequency seems somewhat too large, 
which is perhaps not obvious from inspection. 


Table 25.9 Frequency Table of the Sample in Table 25.8 


I 

2 

3 

4 

5 

Tensile 

Absolute 

Relative 

Cumulative 

Cumulative 

Strength 

Frequency 

Frequency 

Absolute 

Relative 

X 



Frequency 

Frequency 

[lb/in. 2 ] 


7(a) 


m 

300 

2 

0.02 

2 

0.02 

310 

0 

0.00 

2 

0.02 

320 

4 

0.04 

6 

0.06 

330 

6 

0.06 

12 

0.12 

340 

11 

0.11 

23 

0.23 

350 

14 

0.14 

37 

0.37 

360 

16 

0.16 

53 

0.53 

370 

15 

0.15 

68 

0.68 

380 

8 

0.08 

76 

0.76 

390 

10 

0.10 

86 

0.86 

400 

8 

0.08 

94 

0.94 

410 

2 

0.02 

96 

0.96 

420 

3 

0.03 

99 

0.99 

430 

0 

0.00 

99 

0.99 

440 

1 

0.01 

100 

1.00 




SEC 25.7 Goodness of Fit. * 1 2 3 4 -Test 


1079 



[lb./in. 2 ] 

Fig. 541. Frequency histogram of the sample in Table 25.8 

We choose a « 5%. Since K = 10 and we estimated r = 2 parameters we have to use Table A10 in App. 5 
with K - r — 1 = 7 degrees of freedom. We find c = 14.07 as the solution of P(% 2 = c) = 95%. Since 
Xq 2 < c, we accept the hypothesis that the population is normal. M 


Table 25.10 Computations in Example 1 


Xj 

Xj - 364.7 
26.7 

D 


n 


Term in (l) 

-oo . . . 325 

— OO • • • 

-1.49 



6.81 

6 


325 • 

• 335 

-1.49 • • • 

-1.11 

0.0681 


6.54 

6 

0.045 

335 • 

• 345 

“1.11 * * • 

-0.74 

0.1335 


9.61 

11 


345 • 

• 355 

-0.74 • • • 

-0.36 

0.2296 

• • • 0.3594 

12.98 

14 

0.080 

355 • 

• 365 

-0.36 • • • 

0.01 

0.3594 

• • • 0.4960 

13.66 

16 


365 • 

• 375 

0.01 • • • 

0.39 

0.4960 


15.57 

15 

0.021 

375 • 

• 385 

0.39 • • • 

0.76 

0.6517 

• • • 0.7764 

12.47 

8 


385 • 

• 395 

0.76 • • • 

1.13 

0.7764 

• • • 0.8708 

9.44 

10 


395 • 

•405 

1.13 • • • 

1.51 

0.8708 

• • • 0.9345 

6.37 

8 

0.417 

8 

o 

1.51 • • • 

00 

0.9345 


6.55 

6 

0.046 


Xo Z = 2.942 


PROBLEM SET 25.7 


1. If 100 flips of a coin result in 30 heads and 70 tails, 
can we assert on the 5% level that the coin is fair? 

2. If in 10 flips of a coin we get the same ratio as in 
Prob. 1 (3 heads and 7 tails), is the conclusion the same 
as in Prob. 1? First conjecture, then compute. 

3. What would be the smallest number of heads in 
Prob. I under which the hypothesis “Fair coin” is still 
accepted (with a = 5%)? 

4. If in rolling a die 180 times we get 39, 22, 41, 26, 20, 
32, can we claim on the 5% level that the die is fair? 


5. Solve Prob. 4 if the sample is 25, 31, 33, 27, 29, 35. 

6. A manufacturer claims that in a process of producing 
kitchen knives, only 2.5% of the knives are dull. Test 
the claim against the alternative that more than 2.5% 
of the knives are dull, using a sample of 400 knives 
containing 17 dull ones. (Use a = 5%.) 


7. Between I p.m. and 2 p.m. on five consecutive days 
(Monday through Friday) a certain service station has 
92, 60, 66, 62, and 90 customers, respectively. Test the 
hypothesis that the expected number of customers during 
that hour is the same on those days. (Use a = 5%.) 
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8. Test for normality at the 1% level using a sample of 
n = 79 (rounded) values x (tensile strength [kg/mm 2 ] of 
steel sheets of 0.3 mm thickness), a = a(x) = absolute 
frequency. (Take the first two values together, also the 
last three, to get K = 5.) 


X 57 

58 

59 

60 

61 

62 

63 

64 

a 1 4 

10 

17 

27 

8 

9 

3 

1 


9. In a sample of 100 patients having a certain disease 45 
are men and 55 women. Does this support the claim 
that the disease is equally common among men and 
women? Choose a = 5%. 

10. In Prob. 9 find the smallest number (>50) of women 
that leads to the rejection of the hypothesis on the levels 
5%, 1%. 0.5%. 

11. Verify the calculations in Example 1 of the text. 

12. Does the random variable X = Number of accidents 
per week in a certain foundry have a Poisson 
distribution if within 50 weeks, 33 were accident-free, 
I accident occurred in 1 1 of the 50 weeks, 2 in 6 of 
the weeks and more than 2 accidents in no week? 
(Choose a = 5%.) 

13. Using the given sample, test that the corresponding 
population has a Poisson distribution, a is the number 
of alpha particles per 7.5-sec intervals observed by E. 
Rutherford and H. Geiger in one of their classical 
experiments in 1910, and a(x) is the absolute frequency 
(= number of time periods during which exactly x 
particles were observed). (Use a = 5%.) 


.V 

0 

1 

2 

3 

4 

5 

6 

a 

57 

203 

383 

525 

532 

408 

273 

X 

7 

8 

9 

10 

II 

12 

^13 

a 

139 

45 

27 

10 

4 

2 

0 


14. Can we assert that the traffic on the three lanes of an 
expressway (in one direction) is about the same on each 
lane if a count gives 910, 850, 720 cars on the right, 
middle, and left lanes, respectively, during a particular 
time interval? (Use a = 5%.) 

15. If it is known that 25% of certain steel rods produced 
by a standard process will break when subjected to a 


load of 5000 lb, can we claim that a new process yields 
the same breakage rate if we find that in a sample of 
80 rods produced by the new process, 27 rods broke 
when subjected to that load? (Use a = 5%.) 

16. Three samples of 200 rivets each were taken from a 
large production of each of three machines. The 
numbers of defective rivets in the samples were 7, 8, 
and 12. Is this difference significant? (Use a = 5%.) 

17. In a table of properly rounded function values, even 
and odd last decimals should appear about equally 
often. Test this for the 90 values of J x (x) in Table A I 
in App. 5. 

18. Are the 5 tellers in a certain bank equally time-efficient 
if during the same time interval on a certain day they 
serve 120, 95, 1 10, 108, 102 customers? (Useot = 5%.) 

19. CAS EXPERIMENT. Random Number Generator. 
Check your generator experimentally by imitating 
results of n trials of rolling a fair die, with a convenient 
n (e.g.. 60 or 300 or the like). Do this many times and 
see whether you can notice any “nonrandomness” 
features, for example, too few Sixes, loo many even 
numbers, etc., or whether your generator seems to work 
properly. Design and perform other kinds of checks. 

20. TEAM PROJECT. Difficulty with Random 
Selection. 77 students were asked to choose 3 of the 
integers 1 1 , 1 2, 1 3, • • • , 30 completely arbitrarily. The 
amazing result was as follows. 

Number 11 12 13 14 15 16 17 18 19 20 

Frequ. 1 1 10 20 8 13 9 21 9 16 8 

Number 21 22 23 24 25 26 27 28 29 30 

Frequ. 12 8 15 10 10 9 12 8 13 9 

If the selection were completely random, the following 
hypotheses should be true. 

(a) The 20 numbers are equally likely. 

(b) The 10 even numbers together are as likely as the 
10 odd numbers together. 

(c) The 6 prime numbers together have probability 0.3 
and the 1 4 other numbers together have probability 0.7. 
Test these hypotheses, using a — 5%. Design further 
experiments that illustrate the difficulties of random 
selection. 


25.8 Nonparametric Tests 

Nonparametric tests, also called distribution-free tests, are valid for any distribution. 
Hence they are used in cases when the kind of distribution is unknown, or is known but 
such that no tests specifically designed for it are available. In this section we shall explain 
the basic idea of these tests, which are based on “order statistics” and are rather simple. 
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EXAMPLE 1 


EXAMPLE 2 


If there is a choice, then tests designed for a specific distribution generally give better 
results than do nonparametric tests. For instance, this applies to the tests in Sec. 25.4 for 
the normal distribution. 

We shall discuss two tests in terms of typical examples. In deriving the distributions 
used in the test, it is essential that the distributions from which we sample are continuous. 
(Nonparametric tests can also be derived for discrete distributions, but this is slightly more 
complicated.) 

Sign Test for the Median 

A median of the population is a solution .v = jx of the equation F(x) = 0.5, where F is the distribution function 
of the population. 

Suppose that eight radio operators were tested, first in rooms without air-conditioning and then in 
air-conditioned rooms over the same period of time, and the difference of errors (unconditioned minus 
conditioned) were 

9 4 0 6 4 0 7 11. 

Test the hypothesis p = 0 (that is, air-conditioning has no effect) against the alternative p > 0 (that is, inferior 
performance in unconditioned rooms). 

Solution. We choose the significance level a = 5%. If the hypothesis is true, the probability p of a positive 
difference is the same as that of a negative difference. Hence in tills case, p = 0.5, and the random variable 

X = Number of positive values among n values 

has a binomial distribution with p = 0.5. Our sample has eight values. We omit the values 0, which do not 
contribute to the decision. Then six values are left, all of which are positive. Since 

P{X = 6) = (0.5) 6 (0.5)° 

= 0.0156 
= 1.56% 


we do have observed an event whose probability is very small if the hypothesis is true; in fact 1.56% < a = 5%. 
Hence we assert that the alternative p > 0 is true. That is. the number of errors made in unconditioned rooms 
is significantly higher, so that installation of air conditioning should be considered. H 

Test for Arbitrary Trend 

A certain machine is used for cutting lengths of wire. Five successive pieces had the lengths 

29 31 28 30 32. 

Using this sample, test the hypothesis that there is no trend, that is, the machine does not have the tendency to 
produce longer and longer pieces or shorter and shorter pieces. Assume that the type of machine suggests the 
alternative that there is positive trend, that is, there is the tendency of successive pieces to get longer. 

Solution . We count the number of transpositions in the sample, that is. the number of times a larger value 
precedes a smaller value: 


29 precedes 28 ( 1 transposition), 

31 precedes 28 and 30 (2 transpositions). 

The remaining three sample values follow in ascending order. Hence in the sample there are 1 4- 2 = 3 
transpositions. We now consider the random variable 

T = Number of transpositions. 

If the hypothesis is true (no trend), then each of the 51 = 120 permutations of five elements 1 2 3 4 5 has the 
same probability (1/120). We arrange these permutations according to their number of transpositions: 
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7=0 

1 2 3 4 5 


From this we obtain 


7= 1 


7=2 


7= 3 


12 3 5 4 

1 2 4 3 5 

13 2 4 5 

2 13 4 5 


2 

2 

3 

3 

4 
1 
1 
3 
1 


1 4 

2 4 


1 

1 

1 

1 

1 
1 

2 
2 
2 
2 
2 
3 
3 

3 

4 


etc. 


= 3) 120 + 120 120 “P K?0 ” 120 “ 24%. 


We accept the hypothesis because we have observed an event that has a relatively large probability (certainly 
much more than 5%) if the hypothesis is true. 

Values of the distribution function of T in the case of no trend are shown in Table A12, App. 5. For instance, 
if n = 3, then 7( 0) = 0.167, 7(1) = 0.500, 7(2) = 1 - 0.167. If n = 4, then 7(0) = 0.042, 7(1) = 0.167, 
7(2) = 0.375, 7(3) = I - 0.375, 7(4) = 1 - 0.167. and so on. 

Our method and those values refer to continuous distributions. Theoretically, we may then expect that all the 
values of a sample arc different. Practically, some sample values may still be equal, because of rounding: If m 
values are equal, add m(tn — l)/4 (= mean value of the transpositions in the case of the permutations of m 
elements), that is. ^ for each pair of equal values. § for each triple, etc. M 


r 


3EBFB.-E-EM ^SET 


T5.8 


1. What would change in Example 1. had we observed 
only 5 positive values? Only 4? 

2 . Does a process of producing plastic pipes of length 
jjl = 2 meters need adjustment if in a sample. 4 pipes 
have the exact length and 15 are shorter and 3 longer 
than 2 meters? (Use the normal approximation of the 
binomial distribution.) 

3. Do the computations in Prob. 2 without the use of the 
DeMoivre-Laplace limit theorem (in Sec. 24.8). 

4 . Test whether a thermostatic switch is properly set to 
20°C against the alternative that its setting is too low. 
Use a sample of 9 values, 8 of which are less than 20°C 
and I is greater than 20°C. 

5 . Are air filters of type A better than type B filters if in 
10 trials, A gave cleaner air than B in 7 cases, B gave 
cleaner air than A in I case, whereas in 2 of the trials 
the results for A and B were practically the same? 


6. In a clinical experiment, each of 10 patients were given 
two different sedatives A and B. The following table 
shows the effect (increase of sleeping time, measured 
in hours). Using the sign test, find out whether the 
difference is significant. 

A 1.9 0.8 1.1 0.1 -0.1 4.4 5.5 1.6 4.6 3.4 

B 0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0.0 2.0 

Difference 1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4 

7. Assuming that the populations corresponding to the 

samples in Prob. 6 are normal, apply a suitable test for 
the normal distribution. 

8. Thirty new employees were grouped into 15 pairs of 
similar intelligence and experience and were then 
instructed in data processing by an old method (A) 
applied to one (randomly selected) person of each pair, 
and by a new presumably better method (B) applied to 
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the other person of each pair. Test for equality of 
methods against the alternative that (B) is better than 
(A), using the following scores obtained after the end 
of the training period. 


A 

60 

70 

80 

85 

75 

40 

70 

45 

95 

80 

90 

60 

80 

75 

65 

b\ 

65 

85 

85 

80 

95 

65 

100 

60 

90 

85 

1 00 

75 

90 

60 

80 


Temperature T [°C] | 

10 

20 

30 

40 

50 

Reading V [volts] 

99.5 

101.1 

100.4 

100.8 

101.6 


15. In a swine-feeding experiment, the following gains in 
weight [kg] of 10 animals (ordered according to 
increasing amounts of food given per day) were 
recorded: 


9. Assuming normality, solve Prob. 8 by a suitable test 
from Sec. 25.4. 

10. Set up a sign test for the lower quartile <725 (defined by 
the condition F(q 2 5) = 0.25). 

11. How would you proceed in the sign test if the 
hypothesis is jX = /x 0 (any number) instead of jl - 0? 

12. Check the table in Example 2 of the text. 

13. Apply the test in Example 2 to the following data 
(a* = disulfide content of a certain type of wool, 
measured in percent of the content in unreduced fibers; 
y = saturation water content of the wool, measured in 
percent). Test for no trend against negative trend. 


20 17 19 18 23 16 25 28 24 22. 

Test for no trend against positive trend. 

16. Apply the test explained in Example 2 to the following 
data (a* = diastolic blood pressure [mm Hg]. y = weight 
of heart [in grams] of 10 patients who died of cerebral 
hemorrhage). 


A* 

121 

120 

95 

123 

140 

112 

92 

100 

102 

91 

y 

521 

465 

352 

455 

490 

388 

301 

395 

375 

418 


17. Does an increase in temperature cause an increase of 
the yield of a chemical reaction from which the 
following sample was taken? 


L\ 

10 

15 

30 

40 

50 

55 

80 

100 

Temperature [°C] 

10 

20 

30 

40 

60 

80 

y 

50 

46 

43 

42 

36 

39 

37 

33 

Yield [kg/min] 

0.6 

1.1 

0.9 

1.6 

1.2 

2.0 


14. Test the hypothesis that for a certain type of voltmeter, 
readings are independent of temperature T [°C] against 
the alternative that they tend to increase with T. Use a 
sample of values obtained by applying a constant 
voltage: 


18. Does the amount of fertilizer increase the yield of 
wheat X [kg/plot]? Use a sample of values ordered 
according to increasing amounts of fertilizer: 

41.4 43.3 39.6 43.0 44.1 45.6 44.5 46.7. 


25.9 Regression. Fitting Straight Lines. 
Correlation 

So far we were concerned with random experiments in which we observed a single quantity 
(random variable) and got samples whose values were single numbers. In this section we 
discuss experiments in which we observe or measure two quantities simultaneously, $0 
that we get samples of pairs of values (x l9 y\) 9 (a 2 , y 2 ), • * • , (x n , y n ). Most applications 
involve one of two kinds of experiments, as follows. 

1. In regression analysis one of the two variables, call it x , can be regarded as an 
ordinary variable because we can measure it without substantial error or we can even 
give it values we want, x is called the independent variable, or sometimes the 
controlled variable because we can control it (set it at values we choose). The other 
variable, K, is a random variable, and we are interested in the dependence of Y on 
-v. Typical examples are the dependence of the blood pressure Y on the age x of a 
person or, as we shall now say, the regression of Y on a*, the regression of the gain 
of weight Y of certain animals on the daily ration of food x 9 the regression of the 
heat conductivity Y of cork on the specific weight a* of the cork, etc. 
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2. In correlation analysis both quantities are random variables and we are interested 
in relations between them. Examples are the relation (one says “correlation”) between 
wear X and wear Y of the front tires of cars, between grades X and Y of students in 
mathematics and in physics, respectively, between the hardness X of steel plates in 
the center and the hardness Y near the edges of the plates, etc. 


Regression Analysis 

In regression analysis the dependence of Y on x is a dependence of the mean p, of Y on 
a, so that fi = p(x) is a function in the ordinary sense. The curve of p{x) is called the 
regression curve of Y on x. 

In this section we discuss the simplest case, namely, that of a straight regression line 
( 1 ) p(x) = k 0 -I- k x x. 

Then we may want to graph the sample values as n points in the A'T-plane, fit a straight 
line through them, and use it for estimating p{x) at values of x that interest us, so that we 
know what values of Y we can expect for those a. Fitting that line by eye would not be 
good because it would be subjective; that is, different persons’ results would come out 
differently, particularly if the points are scattered. So we need a mathematical method that 
gives a unique result depending only on the n points. A widely used procedure is the method 
of least squares by Gauss and Legendre. For our task we may formulate it as follows. 


Least Squares Principle 

The straight line should be fitted through the given points so that the sum of the 
squares of the distances of those points from the straight line is minimum , where 
the distance is measured in the vertical direction {the y-di reel ion). (Formulas below.) 


To get uniqueness of the straight line, we need some extra condition. To see this, take 
the sample (0, 1), (0, —I). Then all the lines y = k x x with any k x satisfy the principle. 
(Can you see it?) The following assumption will imply uniqueness, as we shall find out. 


General Assumption (A1) 

The x- values x x , • • • , x n in our sample (x x , y x ), • • • , (x n , y n ) are not all equal. 


From a given sample (x ly > 4 ), • • • , ( x n , y n ) we shall now determine a straight line by 
least squares. We write the line as 

( 2 ) y = k 0 + k x x 

and call it the sample regression line because it will be the counterpart of the population 
regression line ( 1 ). 

Now a sample point (a,, yj) has the vertical distance (distance measured in the 
y-direction) from ( 2 ) given by 

\yj ~ (^o + (see Fig. 542). 
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Fig. 542. Vertical distance of a point (x jt y y ) from a straight line y = k 0 + k } x 
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EXAMPLE 1 


From (5) we see that the sample regression line passes through the point (jc, y), by which it 
is determined, together with the regression coefficient (7). We may call s x ? the variance of 
the A*-values, but we should keep in mind that a- is an ordinary variable, not a random variable. 
We shall soon also need 


(9b) 


* 2 = 
*> a 


n — 1 


2 (» - v) 2 = 


j=i 


n — 1 



Derivation of (5) and (7). Differentiating (3) and using (4), we first obtain 


- = -22 (.Vj - *o - *i Xj) = 0, 

= -2 2 x /)’j - *o - ki*j) = 0 

where we sum over j from 1 to n. We now divide by 2, write each of the two sums as 
three sums, and take the sums containing yj and Xjyj over to the right. Then we get the 

“normal equations” 


( 10 ) 


k 0 n + 2 x j = 2 Vj 

2 x j k\ 2 x j ~ 2 x jyy 


This is a linear system of two equations in the two unknowns k 0 and k x . Its coefficient 
determinant is [see (9)] 





= n(n - 1 )s x 2 = /? 2 ( x j ~ f') 2 


and is not zero because of Assumption (Al). Hence the system has a unique solution. 
Dividing the first equation of (10) by n and using (6), we get k 0 = y — k x x. Together with 
y = + ki x > n (2) this gives (5). To get (7), we solve the system (10) by Cramer’s rule 

(Sec. 7.6) or elimination, finding 


(ID 


*i = 


n 2 x j) : j 2 x i 2 .Vj 
n(n - 1 )s x 2 


This gives (7)-(9) and completes the derivation. [The equality of the two expressions in 
(8) and in (9) may be shown by the student; see Prob. 14]. ■ 


Regression Line 

The decrease of volume y [%] of leather for certain fixed values of high pressure a* [atmospheres! was measured. 
The results are shown in the first two columns of Table 25.1 1. Find the regression line of v on .v. 

Solution . We see that n = 4 and obtain the values .v = 28 000/4 = 7000, v = 19.0/4 = 4.75. and from (9) 
and (8) 
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Table 25.11 Regression of the Decrease of Volume y [%] 
of Leather on the Pressure x [Atmospheres] 


Given Values 

Auxiliary Values 

x j 

yj 

y 2 

Xj 

Ws 

4 000 

23 

16 000 000 

9 200 

6 000 

4.1 

36 000 000 

24 600 

8 000 

5.7 

64 000 000 

45 600 

10 000 

6.9 

100 000 000 

69 000 

28 000 

19.0 

216 000 000 

148 400 


2 I / „ 28 000 2 \ 20000000 

•v* = J (216 000000 — j = 

I (148400 ?—)= — • 

Hence = 15 400/20 000 000 = 0.000 77 from (7). and the regression line is 

v - 4.75 = 0.000 77 (.v - 7000) or y = 0.000 77 x - 0.64. 

Note thaty(0) = -0.64, which is physically meaningless, but typically indicates that a linear relation is merely 
an approximation valid on some restricted interval. ■ 

Confidence Intervals in Regression Analysis 

If we want to get confidence intervals, we have to make assumptions about the distribution 
of Y (which we have not made so far; least squares is a “geometric principle,” nowhere 
involving probabilities!). We assume normality and independence in sampling: 

Assumption (A2) 

For each fixed x the random variable Y is normal with mean (1), that is, 

( 12 ) fl(x) = K 0 + KyX 

and variance a 2 independent of x. 

Assumption (A3) 

The n performances of the experiment by which we obtain a sample 


Ui, Ji)> 0*2* V 2 X • • • • (* n , yj 


are independent. 

k x in (12) is called the regression coefficient of the population because it can be shown 
that under Assumptions (Al)— (A3) the maximum likelihood estimate of /cj is the sample 
regression coefficient ^ given by (1 1). 

Under Assumptions (A1)-(A3) we may now obtain a confidence interval for iq, as 
shown in Table 25.12. 
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Table 25.12 Determination of a Confidence Interval for #c , in (1) under Assumptions 
(Al) (A3) 


Step 1. Choose a confidence level y(95%, 99%, or the like). 
Step 2 . Determine the solution c of the equation 

(13) F(c) =|(1 + y) 


from the table of the /-distribution with n — 2 degrees of freedom (Table A9 in 
App. 5; n = sample size). 

Step 3 . Using a sample (x ly y x ), * • • , (* n , v n ), compute (n - 1 )s x 2 from (9a), (n — 1 )s xy 
from (8), k x from (7), 


(14) 


(« - 1 )Sy 2 = 2 .V/ 
3=1 



[as in (9b)], and 


(15) 


CJO = (» - l)(l'y 2 - ^l 2 ^ 2 ). 


Step 4. Compute 


K = c 

V (» - 


<7o 


2 )(n - Os* 2 


The confidence interval is 


(16) CONF r - K ^ /<! ^ k x + K}. 


Confidence Interval for the Regression Coefficient 

Using the sample in Table 25.1 1. determine a confidence interval for kj by the method in Table 25.12. 
Solution . Step L We choose y = 0.95. 

Step 2. Equation (13) takes the form F(c) = 0.975, and Table A9 in App. 5 with n - 2 = 2 degrees of freedom 
gives c = 4.30. 

Step 3 . From Example 1 we have 3 s * = 20 000 000 and k 1 = 0.00077. From Table 25.1 1 we compute 

I9 2 

•V = 102.2 - — 

= 1 1.95, 

r/ 0 = 11 .95 - 20 000 000 • 0.00077 2 
= 0.092. 

Step 4. We thus obtain 

K = 4.30 V0 092/(2 • 20 000 000) 

= 0.000 206 
and 

CONFq. 95 ( 0.00056 ^ s 0.00098}. ■ 
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Correlation Analysis 

We shall now give an introduction to the basic facts in correlation analysis; for proofs see 
Ref. [G2] or [G8] in App. 1 . 

Correlation analysis is concerned with the relation between X and Y in a two-dimensional 
random variable (X y Y) (Sec. 24.9). A sample consists of n ordered pairs of values 
( A T> 7i)» • * • » (.v w , y n ), as before. The interrelation between the x and y values in the 
sample is measured by the sample covariance s xy in (8) or by the sample correlation 
coefficient 


(17) 



s x s y 


with s x and s y given in (9). Here r has the advantage that it does not change under a 
multiplication of the x and y values by a factor (in going from feet to inches, etc.). 


Sample Correlation Coefficient 

The sample correlation coefficient r satisfies — 1 = r ^ l. In particular \ r = ±\ if 
and only if the sample values lie on a straight line. (See Fig. 543.) 


The theoretical counterpart of r is the correlation coefficient p of X and 7, 


(18) 


a XY 

a x (ry 


where /jl x = E(X), fj. Y = E(Y), ar x 2 = E([X - /x x ] z ), <x Y 2 = £([F - /x Y ] 2 ) (the means 
and variances of the marginal distributions of X and Y; see Sec. 24.9), and a XY is the 




r = 0.98 
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Fig. 543. Samples with various values of the correlation coefficient r 
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THEOREM 


THEOREM 


EXAMPLE 3 


covariance of X and Y given by (see Sec. 24.9) 

(19) (Txy = E([X - fi x ][Y - fiy]) = E(XY) - E(X)E(Y). 

The analog of Theorem 1 is 


Correlation Coefficient 

The correlation coefficient p satisfies — \^=pt=zl.In particular , p = ±1 if and 
only ifX and Y are linearly related, that is, Y = yX + 8, X = y*Y + S*. 


X and Y are called uncorrelated if p = 0. 


Independence. Normal Distribution 

(a) Independent X and Y (see Sec. 24.9) are uncorrelated . 

(b) If (X, Y) is normal (see below), then uncorrelated X and Y are 
independent . 


Here the two-dimensional normal distribution can be introduced by taking two independent 
standardized normal random variables X*, Y* y whose joint distribution thus has the density 


( 20 ) 


/*(**, y*) 


2t r 


(representing a surface of revolution over the x*y*-plane with a bell-shaped curve as cross 
section) and setting 

* = Mx + ox** 

K = Mr + PVy** + Vl - p 2 0yy* 

This gives the general two-dimensional normal distribution with the density 

1 


(21a) 
where 
(21b) h(x. 


f(x,y) = 


2tT<J x CTy V 1 — p 2 


e -Mx,y)f 2 




In Theorem 3(b), normality is important, as we can see from the following example. 

Uncorrelated but Dependent Random Variables 

If X assumes -1,0, ! with probability 1/3 and Y - X 2 , then E( X) = 0 and in (3) 


<7*y = E(XY) = E(X 3 ) = (-I) 3 - J + 0 3 - J + I 3 - y = o, 

so that p = 0 and X and Y are uncorrelated. But they are certainly not independent since they are even functionally 
related. ■> 
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Test for the Correlation Coefficient p 

Table 25.13 shows a test for p in the case of the two-dimensional normal distribution. 
t is an observed value of a random variable that has a /-distribution with n — 2 degrees 
of freedom. This was shown by R. A. Fisher (. Biometrika 10 (1915), 507-521). 

Table 25.13 Test of the Hypothesis p = 0 Against the Alternative p > 0 in the Case 
of the Two-Dimensional Normal Distribution 


Step L Choose a significance level a. (5%, 1%, or the like). 
Step 2. Determine the solution c of the equation 


P(T^c) = l - a 


from the /-distribution (Table A9 in App. 5) with n — 2 degrees of freedom. 
Step 3. Compute r from (17), using a sample (. x ly y 2 ), • • • , (x ni y n ). 

Step 4 . Compute 


/ = r 



If / ^ c, accept the hypothesis. If t > c, reject the hypothesis. 


EXAMPLE 4 Test for the Correlation Coefficient p 

Test the hypothesis p = 0 (independence of X and Y, because of Theorem 3) against the alternative p > 0, using 
the data in the lower left corner of Fig. 543, where r = 0.6 (manual soldering errors on 10 two-sided circuit 
boards done by 1 0 workers; a- = front, y — back of the boards). 

Solution . We choo se a. = 5%; thus 1 - a = 95%. Since n = 10, n - 2 = 8, the table gives c = 1.86. 
Also, t = 0.6V8/0.64 = 2.12 > c. We reject the hypothesis and assert that there is a positive correlation. A 
worker making few (many) errors on the front side also tends to make few (many) errors on the reverse side of 
the board. ■ 


StT 


1 1 - 1 0 1 SAMPLE REGRESSION LINE 

Find and sketch or graph the sample regression line of y 
and x and the given data as points on the same axes. 

1. (-1. 0,(0, 1.7), (1,3) 

2. (3, 3.5), (5, 2), (7, 4.5), (9, 3) 

3. (2, 12), (5, 24), (9, 33), (14, 50) 

4. (11, 22), (15, 18), (17, 16), (20, 9), (22, 10) 

5. Speed x [mph] of a car 30 40 50 60 

Stopping distance y [ft] 150 195 240 295 

Also find the stopping distance at 35 mph. 


6. x = Deformation of a certain steel [mm], y = Brinell 
hardness [kg/mm 2 ] 

,v 6 9 11 13 22 26 28 33 35 

y 68 67 65 53 44 40 37 34 32 

7. ,v = Revolutions per minute, v = Power of a Diesel 
engine [hp] 

.v 400 500 600 700 750 

y 580 1030 1420 1880 2100 
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8. Humidity of air x [%] 

10 

20 

30 

40 

Expansion of gelatin y [%] 

0.8 

1.6 

2.3 

2.8 

9. Voltage a* [V] 40 40 

80 

80 

no 

110 

Current y [A] 5.1 4.8 

10.0 

10.3 

13.0 12.7 

Also find the resistance 
(Sec. 2.9]. 

R [fl] 

by 

Ohms’ law 

10. Force a* [lb] 

2 

4 

6 

8 

Extension y [in] of a spring 

4.1 

7.8 

12.3 15.8 


Also find the spring modulus by Hooke’s law 
(Sec. 2.4). 


11-13 


CONFIDENCE INTERVALS 


Find a 95% confidence interval for the regression 
coefficient k,, assuming that (A2) and (A3) hold and using 
the sample: 


11. In Prob. 6 


12. In Prob. 7 


13. In Prob. 8 


14. Derive the second expression for s. 2 in (9a) from the 
First one. 

15. CAS EXPERIMENT. Moving Data. Take a sample, 
for instance, that in Prob. 6, and investigate and graph 
the effect of changing y - values (a) for small a*, (b) for 
large a, (c) in the middle of the sample. 




STIONS AND PROBLEMS 


1. What is a sample? Why do we take samples? 

2. What is the role of probability theory in statistics? 

3. Will you get better results by taking larger samples? 
Explain. 

4. Do several samples from a certain population have the 
same mean? The same variance? 

5. What is a parameter? How can we estimate it? Give an 
example. 

6. What is a statistical test? What errors occur in testing? 

7. How do we test in quality control? 

8. What is the * 2 -test? Give a simple example from 
memory. 

9. What are nonparametric tests? When would you apply 
them? 

10. In what tests did we use the /-distribution? The 
^-distribution? 

11. What are one-sided and two-sided tests? Give typical 
examples. 

12. List some areas of application of statistical tests. 

13. What do we mean by “goodness of fit”? 

14. Acceptance sampling uses principles of testing. Explain. 

15. What is the power of a test? What can you do if the 
power is low? 

16. Explain the idea of a maximum likelihood estimate from 
memory. 

17. How does the length of a confidence interval depend on 
the sample size? On the confidence level? 


18. Couldn’t we make the error in interval estimation zero 
simply by choosing the confidence level I ? 

19. What is the least squares principle? Give applications. 

20. What is the difference between regression and 
correlation analysis? 

21. Find the maximum likelihood estimates of mean and 
variance of a normal distribution using the sample 5, 4, 
6, 5, 3, 5, 7, 4, 6, 5, 8, 6. 

22. Determine a 95% confidence interval for the mean jx of 
a normal population with variance cr 2 = 16, using a 
sample of size 400 with mean 53. 

23. What will happen to the length of the interval in Prob. 
22 if we reduce the sample size to 100? 

24. Determine a 99% confidence interval for the mean of a 
normal population with standard deviation 2.2, using the 
sample 28, 24, 31, 27, 22. 

25. What confidence interval do we obtain in Prob. 24 if 
we assume the variance to be unknown? 


26. Assuming normality, find a 95% confidence interval for 
the variance from the sample 145.3. 145.1. 145.4. 146.2. 


27-29 


Find a 95% confidence interval for the mean fx. 


assuming normality and using the sample: 


27. Nitrogen content [%] of steel 0.74, 0.75, 0.73, 0.75, 
0.74, 0.72 

28. Diameters of 10 gaskets with mean 4.37 cm and 
standard deviation 0.157 cm 

29. Density fg/cm 3 ] of coke 1.40, 1.45, 1.39, 1.44, 1.38 
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30. What sample size should we use in Prob. 28 if we want 
to obtain a confidence interval of length 0.1, assuming that 
the standard deviation of the samples is (about) the same? 

1 3 1-32 1 Find a 99% confidence interval for die variance 

cr 2 , assuning normality and using the sample: 

31. Rockwell hardness of tool bits 64.9, 64.1, 63.8, 64.0 

32. A sample of size n = 128 with variance s 2 = 1.921 

33. Using a sample of 10 values with mean 14.5 from a 
normal population with variance cr 2 = 0.25, test the 
hypothesis /iq = 15.0 against the alternative fx x - 14.4 
on the 5% level. 

34. In Prob. 33, change the alternative to /x ^ 15.0 and test 
as before. 

35. Find the power in Prob. 33. 

36. Using a sample of 15 values with mean 36.2 and 
variance 0.9, test the hypothesis fx 0 = 35.0 against the 
alternative /x x = 37.0, assuming normality and taking 
a = 1%. 

37. Using a sample of 20 values with variance 8.25 from a 
normal population, test the hyothesis a 0 2 = 5.0 against 
the alternative a 2 = 8.1, choosing a = 5%. 

38. A firm sells paint in cans containing 1 kg of paint per 
can and is interested to know whether the mean weight 
differs significantly from 1 kg, in which case the filling 
machine must be adjusted. Set up a hypothesis and an 
alternative and perform the test, assuming normality and 
using a sample of 20 fillings having a mean of 991 g 
and a standard deviation of 8 g. (Choose a = 5%.) 

39. Using samples of sizes 10 and 5 with variances $* = 50 
and s* = 20 and assuming normality of the corresponding 
populations, test the hypothesis H 0 : a 2 - cr 2 against 
the alternative or 2 > cr 2 . Choose a = 5%. 


40. Assume the thickness X of washers to be normal with 
mean 2.75 mm and variance 0.00024 mm 2 . Set up a 
control chart for ix . choosing a = 1%, and graph the 
means of the five samples (2.74, 2.76), (2.74, 2.74), 
(2.79, 2.81), (2.78, 2.76), (2.71, 2.75) on the chart. 

41. What effect on UCL — LCL in a control chart for the 
mean does it have if we double the sample size? If we 
switch from a = 1 % to a = 5%? 

42. The following samples of screws (length in inches) were 
taken from an ongoing production. Assuming that the 
population is normal with mean 3.500 and variance 
0.0004, set up a control chart for the mean, choosing 
a = 1 %, and graph the sample means on the chart. 

Sample No. I 2 3 4 5 6 7 8 

, 3.49 3.48 3.52 3.50 3.51 3.49 3.52 3.53 

LCngtft 3.50 3.47 3.49 3.51 3.48 3.50 3.50 3.49 

43. A purchaser checks gaskets by a single sampling plan 
that uses a sample size of 40 and an acceptance number 
of 1 . Use Table A6 in App. 5 to compute the probability 
of acceptance of lots containing the following 
percentages of defective gaskets £%, £%, 1 %, 2%, 5%, 
10%. Graph the OC curve. (Use the Poisson 
approximation.) 

44. Does an automatic cutter have the tendency of cutting 
longer and longer pieces of wire if the lengths of 
subsequent pieces [in.] were 10.1, 9.8, 9.9, 10.2, 10.6, 
10.5? 

45. Find the least squares regression line to the data (-2, I ), 
(0. 1). (2, 3), (4. 4), (6, 5). 


SUMMARY OJ- XHAI!TIR_~25_ 

Mathematical Statistics 


We recall from Chap. 24 that with an experiment in which we observe some quantity 
(number of defectives, height of persons, etc.) there is associated a random variable 
X whose probability distribution is given by a distribution function 

(1) F(x) = P(X ^ x) (Sec. 24.5) 

which for each a* gives the probability that X assumes any value not exceeding x. 
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In statistics we take random samples jc lf • • • , x n of size n by performing that 
experiment n times (Sec. 25.1) and draw conclusions from properties of samples 
about properties of the distribution of the corresponding X. We do this by calculating 
point estimates or confidence internals or by performing a test for parameters 
(/x and a 2 in the normal distribution, p in the binomial distribution, etc.) or by a 
test for distribution functions. 

A point estimate (Sec. 25.2) is an approximate value for a parameter in the 
distribution of X obtained from a sample. Notably, the sample mean (Sec. 25.1) 


1 1 

(2) X = — X x j = — C*l + * ‘ ‘ + x n) 

j - 1 

is an estimate of the mean /x of X , and the sample variance (Sec. 25.1) 


J 1 

(3) s 2 = - - 2 (*j ~ x) Z = ~ _ 7 fUi - .v) 2 + • • • + (x n - x) 2 \ 

11 1 j= i n 

is an estimate of the variance a 2 of X. Point estimation can be done by the basic 
maximum likelihood method (Sec. 25.2). 

Confidence intervals (Sec. 25.3) are intervals 0* ^ 0 ^ 0 2 with endpoints 
calculated from a sample such that with a high probability y we obtain an interval 
that contains the unknown true value of the parameter 0 in the distribution of X . 
Here, y is chosen at the beginning, usually 95% or 99%. We denote such an interval 
by CONF y {0! ^0^ 0 2 }. 

In a test for a parameter we test a hypothesis 0 = 0 O against an alternative 0 = 6 X 
and then, on the basis of a sample, accept the hypothesis, or we reject it in favor of 
the alternative (Sec. 25.4). Like any conclusion about X from samples, this may 
involve errors leading to a false decision. There is a small probability a (which we 
can choose, 5% or 1%, for instance) that we reject a true hypothesis, and there is a 
probability p (which we can compute and decrease by taking larger samples) that 
we accept a false hypothesis, a is called the significance level and 1 — /3 the power 
of the test. Among many other engineering applications, testing is used in quality 
control (Sec. 25.5) and acceptance sampling (Sec. 25.6). 

If not merely a parameter but the kind of distribution of X is unknown, we can 
use the chi-square test (Sec. 25.7) for testing the hypothesis that some function 
F(x ) is the unknown distribution function of X. This is done by determining the 
discrepancy between F( x) and the distribution function F(x) of a given sample. 

“Distribution-free” or nonparametric tests are tests that apply to any distribution, 
since they are based on combinatorial ideas. These tests are usually very simple. 
Two of them are discussed in Sec. 25.8. 

The last section deals with samples of pairs of values, which arise in an 
experiment when we simultaneously observe two quantities. In regression analysis, 
one of the quantities, x , is an ordinary variable and the other, T, is a random variable 
whose mean /x depends on a*, say, /x( x) = k 0 + k x x. In correlation analysis the 
relation between X and Tin a two-dimensional random variable (X, Y) is investigated, 
notably in terms of the correlation coefficient p. 
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APPENDIX 2 
Answers to 

Odd-Numbered Problems 


Problem Set 1.1, page 8 

1. (cos tta) / 77 + c 3. e ^ 12 + c 5. First order 

7. Second order 9. Third order 

11. y = \ tan (2a* + nif), n = 0, ±1, ±2, • • • 

13. y = e~ x * 15. (A) No. (B) No. Only y = 0. 

17. y" = g, y = gt, y = gt 2 / 2 

19. y" = A, y' = kt + 6, 3- = \kt 2 + 6 1, 3<60) = 1800* + 360 = 3000, k = 1.47, 

>•'(60) = 1.47 • 60 + 6 = 94 [m/sec] = 210 [mph] 

21. e kH = H = (In §)/A = (10 u In 2)/1.4 = 1570 [years] 

Problem Set 1.2, page 11 

11. y — —(2 hr) cos + c 15. y = x(l - In x) + c 

17. Verify the general solution y 2 + t 2 = c. Circle of radius 3V2 
19. mv' = mg — bv 2 , v' = 9.8 - v 2 , u(0) = 10. v' = 0 gives the limit 
V9^8 = 3.1 [meter/sec]. 


Problem Set 1.3, page 18 


3. cos 2 y dy = 2 dx, y = \ arcsin (4 jc + c) 
7. dy/y = cot ttx dx, y = c(sin ttx) 11 " 

11 . r = r 0 e -t2 

15. y = e x l\/2x + 5 

19. y = Vln (a - 2 -Zx+ e) 


5. y 2 + 36x 2 = c, ellipses 
9. y = tan (c — e~ irx h t) 
13. J = / 0 <? _Rt/L 
17. 3' = 4 In x 


21 . y' = (3’ - b)Kx - a), y - b = c(x — a) 

23. y 0 e fc = 2y 0 , c ,c = 2 (1 week), e 2k = 2 2 (2 weeks), e 4fc = 2 4 

25. y = 3' 0 e kt = y 0 c-° 0001213t = yoe -00001213 ' 4000 = 0.62y o ; 62%; cf. Example 2. 

27. y' = -Ay, y = y 0 e~ kt , e~ 5k = 0.5, A = -(In 0.5)/5 = 0.139, 

1 = -(In 0.05)/0. 139 = 22 [min] 

29. T(0) = 10, T = 23 - I3e kt , 7(2) = 23 - 13c 2fc = 18, A = -0.478, T = 22.8 
gives t = [In ( — 0.2/— 1 3)] /( — 0.478) = 8.73 [min]. 


31. h = gt 2 / 2, f = V2A/g, v = gt = gV2h/g = V2gA 

33. y' = 0 - (2/800)y, y = 200e-°° 025t , t = 300 [min], y(300) = 94.5 [lb] 

35. (A) is related to the error function and (C) concerns the Fresnel integral C(jc); see 
App. 3.1. (D) y' = 2*y + 1, y(0) = 0 
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Problem Set 1.4, page 25 
1. Exact, .v 4 4- y 4 = c 

3. Exact, u = cos irx sinh y 4- k(y), u y = cos ttx cosh v 4 k', k r = 0. 

Arts, cos ttx sinh y = c 
5. Exact. 9x 2 4 4 y 2 = c 

7. Exact, M 0 = N r = -2e~ 2 \ u = re~ 20 4 Jt(0), u 0 = -2 re~ 20 4 it', it' = 0. 

A/w. re“ 20 = c, r = ce 20 

9. Exact, u = y/x 4- sin 2x 4- k(y), u y = 1/a* 4 - k r = 1/a* — 2 sin 2y. 

A/w. v/a* 4 sin 2 a* 4- cos 2y = c 

11. Not exact. F = 1/a 2 by Theorem 1. —y/x 2 dx 4 1/a* cly = d(y/x) = 0. Ans. y = ca 
13. —3y 2 /x 4 dx 4 2 y/x 3 dy = d(y 2 /x 3 ) = 0. y = ca 372 (semicubical parabolas) 

15. Exact, u — e 2 * cos y 4 k(y\ u y — -e 2x sin y 4 k' 9 k' =0. Ans. e 2 * cos y = c, 
c = 1 

17. Not exact. Try R. F = e _x , ^(cos oar 4 w sin cox) dx 4 dy = 0, it = y 4 /(a*), 
m* = /' = e^Ccos awr 4 sin tax), u = y 4 l = y — e~ x cos <ox — c\ c = 0 

19. u = e ,T 4 £(y), u y = k f = — 14 e v , k = — v 4 e*. A/is. e ,v — y 4 e v = c 
21. B = C, |Aa 2 4 Cxy 4 ^Dy 2 = c 

Problem Set 1.5, page 32 

3. y = -3 5.r + 0 8 5 . y = 2.6<T 125r 4 4 

7. y = a* 4 c (if k = 0). y = ce~ kx 4 e 2kx /3k if k * 0 
9. Separate, y - 2.5 = c cosh 4 1.5a* 11. y = 2xe cos 2x 

13. y = sin 2 a 4 c/sin 2 2a, c = 1 15. y = e lfx (x 2 4 c), c = 4. 1 

17. y = (c 4 I cosh 10 a) /a 3 . Note (A 3 y)' = 5 sinh 10 a. 

19. .v = 1 hi, u = ce~ S 7x - |y 

21. u = y~ 2 = e**(l + ce 2x ), c = 3, «(0) = 4 

23. Separate variables, y 2 = 1 — ce cos x , c = — Me 

25. y' = /?>• + /r, y = ce Rt — k/R, c = y 0 + A//?. y 0 = 1 000, R = 0.06, 

/ = 65 - 25 = 40, k = 1000, y = $178 076.12. Start at 45 gives 
y 0 [(l + l/0.06)<?°° 6 ' 20 - 1/0.06] = 41 ,988732y 0 = 178 076.12, y 0 = k = $4241.05. 
27. y' = 175(0.0001 - y/450), y(0) = 450 • 0.0004 = 0.18, 
y = 0.135e"°' 3889t + 0.045 = 0.18/2, 

e — o.3889t = (0 09 _ o.045)/0.135 = 1/3, 

t = (In 3)/0.3889 = 2.82. Ans. About 3 years 
29. y' = A - ky, y(0) = 0, y = A(1 - e~ kt )/k 

31. y' = By 2 - Ay = By(y - A/B), A > 0, B > 0. Constant solutions y = 0, y = A/B. 
y > 0 if y > A/B (unlimited growth), y' < 0 if 0 < y < A/B (extinction), 
y = A/{ce Al + B), y(0) > A/B if c < 0, y(0) < A/B if c > 0. 

33. y' = y - y 2 - 0.2y, y = 1/(1. 25 - 0.75*T°- 8t ), limit 0.8, limit 1 
35. y 1 = y — 0.25y 2 — 0.1 v = 0.25y(3.6 - v). Equilibrium harvest 3.6, 
y = 18/(5 + ce~ 09t ) 

37. (yj + y 2 )' + p(yi + y 2 ) = OV + pyi) + ( y 2 ' + py 2 ) = 0 + 0 = 0 

39 - O’t + v 2 )' + p(yi + y 2 ) = (yi + py x ) + (y 2 ' + py 2 ) = r + 0 = r 

41. Solution of cy x + pcy x = c(y\ + py x ) = cr 
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43. CAS Experiment (a) y = x sin (1/x) 4- cx. c = 0 \iy{2hr) = 2 hr. y is undefined at 
x = 0, the point at which the “waves” of sin (1/*) accumulate; the factor x 
makes them smaller and smaller. Experiment with various ^-intervals. 

(b) y = jt n [sin (1 /jc) 4- c]. y(2Air) = (2/7r) ? \ n need not be an integer. Try n = 

Try n = - 1 and see how the “waves” near 0 become larger and larger. 

45. y = wy*, y' 4- py = uy* + uy* f 4- pay* = uy* 4- u(y* r 4- py*) = uy * 4- u • 0 
= i\ u = r/y* = re fp dx , u = / ^ /* dx 4- c. Thus, y = uy h gives (4). We shall 

see that tliis method extends to higher-order ODEs (Secs. 2.10 and 3.3). 

Problem Set 1.6, page 36 

1. y' = 4, y' = — 1/4, v = — x/4 4- c* 

3. y/x = c, y'/x = y/x 2 , y' = y/x, y f = —x/y, y 2 + x 2 = c*, circles 

5. 2xy 4- x 2 y r = 0, y r = -2v/a\ y' = x/(2y), y 2 - x 2 /2 = c* 9 hyperbolas 

7. ye~ x * 12 — c, y f = xv, y f = - l/(xy), yy' — — 1/a*, y 2 /2 = —In \x\ + c** t 
x = c*e~y 212 , bell-shaped curves (with a* and y interchanged) 

9. y = — 4x/y, y* = y/4x , 4 hi |y| = In |a| 4- c**, a* = c*y 4 , parabolas 
11. xe~ yH = c, y 7 = 4/x, y f = —a/ 4, y = — a 2 /8 4- c* 

13. Use dyfclx = l/(dx/dy). (y — 2x)e x = c , (y 7 — 2 4- y — 2x)<r x = 0, 
y = 2 — y 4 2 a, dx/dy = — 2 4- y - 2x is linear, 
dx/dy 4- 2a = y - 2, a = c**- 2 * 4- y/2 - 5/4 
15. u = c\ n r dx 4- w^dy = 0, y 7 = —uja y . Trajectories y 7 = w#/w a .. Now u = c*, 
i^-dx 4- y y dy = 0, y 7 = —v x /v y . This agrees with the trajectory ODE in a if 
u x = v y (equal denominators) and u y = — u x (equal numerators). But these are just 
the Cauchy-Riemann equations. 

17. 2a 4- 2yy 7 = 0, y' = — x/y. Trajectories y* = y/x , In |y| = In |a| 4- c**, y = c*x. 
19. y 7 = — 4x/9y. Trajectories y' = 9y/4x. y = c*x m (c* > 0). Sketch or graph these 
curves. 

Problem Set 1.7, page 41 

1. In |x — x 0 | < ct\ just take b in * = b/K large, namely, b = aK. 

3. No. At a common point (x ls y x ) they would both satisfy the “initial condition” 
y(x 3 ) = y 1? violating uniqueness. 

5. y = /(.v, = r(x) — p(x)y ; hence df/dy = — /?(x) is continuous and is thus 

bounded in the closed interval |x — x 0 \ ^ a. 

7. R has sides 2 a and 26 and center (1,1) since v(l) = 1. In R, 

f = 2 y 2 ^ 2(6 + l) 2 = AT, a = b/K = 6/(2(6 4- l) 2 ). da/d6 = 0 gives 6=1, and 
^apt = b/K = 1/8. Solution by dy/y 2 = 2 dx, etc., y = 1/(3 - 2x). 

9. |1 4- 3‘ 2 1 ^ K = 1 + Z? 2 , a = 6/^, da/dZ? = 0, b = 1, a = 1/2. 

Chapter 1 Review Questions and Problems, page 42 

11. dy/(y 2 4- 5) = 4 dx, 2 arctan 2y = 4x 4- c*, y = \ tan (2r 4- c) 

13. Logistic ODE. y = 1/w, v 7 = —it* hi 2 = 4/w — I/w 2 , u = c*^~ 4 ’ T 4- 1 
15. dy/(y 2 4- 1) = x 2 dx, arctan y = x 3 /3 4- c, y = tan (x s /3 4- c) 

17. Bernoulli. y f 4- xy = x/y, u = y 2 , u = 2yy' = 2x - Zvw linear, 
w = ^“* r2 (/^ 2 2x dx 4 c) = 1 4 y = Vm. Or write 
yy 7 = — x(y 2 - 1) and separate. 
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19. Lineal*, y = e cos x (J>“ cos * v sin a* dx 4- c) = ce cos x 4- 1. Or by separation. 

21. Not exact. Use Theorem 1, Sec. 1.4; R = 2/a, F = a 2 ; the resulting exact ODE is 
3a 2 sin 2 y dx + 2a 3 cos 2 y dy = d( a 3 sin 2y), a 3 sin 2y = c. Or by separation, 
cot 2y dy = — 3/(2a) dx ; etc., sin 2v = ca -3 . 

23. Exact. « = JM Ja = sin xy — a 2 + *, u tJ = a cos Ay 4- *' = N. * = y 2 , 
sin at — A 2 4- y 2 = c. 

25. Not exact. R* = 1 in Theorem 2, Sec. 1 .4, F* = e u . Exact is 
sin (y — a) dLv 4- <? y [cos (y — a) — sin (y — a)] dy = 0. 
u = / M dA = ^ cos (y - a) 4- *, u y = <? y (cos (y — x) — sin (v — a )) 4- *' = /V, 
e y cos (y — a) = c. 

27. Separation, y 2 4- a 2 = 25 

29. Separation, v = tan (a 4- c), c = —577 

31. Exact, u = a 2 v 2 4- cos x 4- 2y = t\ c = u(0, 1 ) = 3 

33. y r = xfy. Trajectories y f = — y/.v, y = c*/a by separation. Hyperbolas. 

35. v = yoe k \ e 4k = 0.9, k = \ In 0.9, e kt = 0.5, 
r = (In 0.5)/* = (In 0.5)/[(ln 0.9)/4] = 26.3 [days! 

37. e kt = 0.01, / = (In 0.01)/* = 175 [days] 

39. y f = — 4a/v. Trajectories y = c^a 174 or x — c 2 y 4 

41. Logistic ODE y' = Ay — By 2 , y = ]/u> it 4- Au = 4-5, w = C£~ At 4- 5/A 
43. A = amount of incident light. A thin layer of thickness Aa absorbs A A = — *AAa 
(— * = constant of proportionality). Thus AA/Aa = — *A. Let Aa-^ 0. Then 
A ; = — *A, A = Aotf - ^ = amount of light in a thick layer at depth a from the 
surface of incidence. 

Problem Set 2.1, page 52 

1. y = 2.5<? 4 * + O.SeT 4 * 3. y = e“* cos a 5. y = 4a 2 4- 7/a 2 

7. Yes 9. Yes if a* 0 11. No 

13. No 15. F(a, & z ) = 0 17. y = + c 2 

19. y dzldy = 4z, y = (c x a 4- c 2 ) -173 

21. (dzldy)z — — z 3 sin y, — 1/z = —dx/dy — cos y 4- a = —sin y 4- c L y 4- c 2 

23. y V = 2, y = § (r + l) 372 - i y(3) = y'(3) = 4 

25. y" = ky\ z = fe, z = = y\ Cj = 1, y = (**“ - 1)/* 


Problem Set 2.2, page 59 




1. v = 

c 1 e 7a ' + 

3. 

y = 

(c, + c 2 A-)e 25x 

5. y = 

c x e°- 9x + c 2 e~ llx 

7. 

y = 

<?°- 5x (/l cos 1.5a- + B sin 1.5a-) 

9* y = 

Cl e 35x + c 2 e~ 15x 

11. 

y = 

A cos 3m* 4- 5 sin 37 ta 

13. y = 

c x e 12x + c 2 e~ 12x 

15. 

~ n 

y ~ 

- 3y' + 2v = 0 

17. v" - 

- 2V3 v' + 3y = 0 

19. 

tt 

y ■ 

- 16y = 0 

21. y = 

4e Sx - 2e~ x 

23. 

3 7 = 

e~ 2x (2 cos a- — sin a) 

II 

c* 

2 + c~ vx 

27. 

y = 

(2 - 4x)e~°- 25x 

29. v = 

e~ 01x (3.2 cos 0.2a- + 1.6 sin 0.2a) 

31. 

y = 

a 

1 

1 

10 

U> 

V 

M* 

ii 

= y 2 = 0.00 le* + <?-- r 





35. Write E = e~ aJCl2 , c = cos car, 5: = sin car. Note that E ' = -^5, c' = 

= (oc . Substitute, drop 5, collect oterms, then .?-terms, and use a> 2 = b — \a 2 y 
to get c(& - \a 2 4- \a 2 - cu 2 ) 4- s(-a<o 4- 4- %a<o) = 0 + 0 = 0. 
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Problem Set 2.3, page 61 

1. 0, 0, -2 cos x 3. -0.8 a 3 + 6a 2 + 0.4, 0, e 0Ax 

5. — 1 2v 3 + 9.v 2 + 8x - 2, —28 sin 4x — 4 cos 4.v, 0 

7. v = (cj + c 2 x)e~ Zv 9. v = e~ 3x (A cos 2a + B sin 2a) 

11. y = f 1 £' _31v + c 2 e~ x 13. y = A cos 4.2 w.v + B sin 4. 2 tax 

Problem Set 2.4, page 68 

1. v = .Vo cos u> 0 t + (v 0 /ojq) sin co 0 t. At integer t (if a> 0 = tt), because of periodicity. 

3. niLd" = —mg sin 6 = —mgd (tangential component of W = mg), 0" + w 0 2 d = 0, 
w 0 /(2tt) = Vg/Z/(27T). 

5. No. because the frequency depends only on k/m. 

7. (i) Greater by a factor V3. (ii) Lower 

9. w* = [co 0 2 - c z /{4m 2 )] 112 = co 0 {\ - c 2 /(4mk)] lt2 = <o 0 (l - c 2 /Smk) = 2.9583 

11. 27j7w* since Eq. (10) and y' = 0 give tan («*/ — 8) = -a/co*; tan is periodic 
with period tt/co*. 

13. Case (II) of (5) with c = \/4mk = V4 • 500 • 4500 = 3000 [kg/secj, where 500 kg 
is the mass per wheel. 

15. y - [v 0 + (u 0 + cn-o)/]^ - " 1 , y = [1 + (u 0 + l)/]e _t ; (ii) u 0 = -2, -3/2, -4/3, 
-5/4, -6/5 

17. y = 0 gives = -c 2 e~ 2f3t , which has one or no positive zero, depending on the 
initial conditions. 


Problem Set 2.5, page 72 

1. c x .v 3 + c 2 a -2 

5. a'[A cos (In |.v|) -I- B sin (In |x|)] 

9. C'i-V 01 + c 2 a‘°' 9 

13. ,v _05 L2 cos (10 In |a|) - sin (10 In |a|)] 


3. (c^ + c* 2 In |a|)a 4 
7. CjA’ 1 ' 4 + c 2 x 16 
11 . 3 A' 2 - 2a- 3 
15.2a" 3 + 10 


Problem Set 2.6, page 77 

1. y" - 0.25y = 0, W = - 1 3. y" - 2k?' + k 2 y = 0, W = e 2kx 

5. a-V + 0.5Ay' + 0.0625y = 0, W = a- -05 7. a- 2 v" 4- ,vy' + 4y = 0, W = 2/a- 

9. x 2 y" - 0.75 y = 0, W = -2 11. y" - 6.25y = 0, W = 2.5 

13. y" + 2y' + 1.64v = 0, IV = 0.8e _2x 15. y" + 5y' + 6.34y = 0, W = 0.3e _5r 

17. y" + 7.6t7t' + 14.44^’ = 0, W = e~ T6l7X 


Problem Set 2.7, page 83 

1. + c 2 e~ 2x + 2.5e 2x 3. + c^- 4 * + lAxe 4 * - 4e x 

5. c^ 21 + c 2 e~ Zx — x 3 — 3a- — 0.5 

7. e~ 3x (A cos 8.v + B sin 8a) + <?*(cos 4.v + \ sin 4 a) 

9. Cl e-° Ax + c 2 e 0Ax + 20xe OAr - 20xe~ OAx 
11 . Ci cos 1.2a- + c 2 sin 1.2x + 10 a sin 1.2a- 

13. e~ Zv (A cos a + B sin a) + 5a- 2 - 8a + 4.4-1 .6 cos 2 a + 0.2 sin 2 a 
15. 4a sin 2 a 
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17 . <r oa ‘ r (1.5 cos 0.5a* - sin 0.5a) 4 2e 0 5x 
19 . 2e~ Zx 4 3e 4x - 12a 3 + 3a 2 - 6.5a 

Problem Set 2.8, page 90 

1 . —0.4 cos 3 / 4 7.2 sin 3 / 3 . -12.8 cos 4.5 / 4 3.6 sin 4.5/ 

5 . 0.16 cos 2/ 4 0.12 sin 2/ 7 . £= cos 3/ — ^ sin 3/ 

9 . c x e~ t/2 4 c 2 e~ 2tl2 - ^ cos / — § sin / 

11 . (c x 4 c 2 t)e~ ZU2 — § cos 3/ — sin 3/ 

13 . £“ 15t 04 cos / 4- B sin /) + 4 4- 0.8 cos 2r — 6.4 sin 2t 
15 . 0.32e“ l cos 5/ 4 0.68 cos 3/ 4- 0.24 sin 3/ 

17 . 5e~ 4t - 4e~ 2t - 0.3 cos 2/ 4 0.1 sin 2/ 

19 . £“ 1,5t (0.2 cos r — 1.1 sin /) 4 0.8 cos / 4- 0.4 sin t 

Problem Set 2.9, page 97 

1. Li' +/?/ = £,/= (E/R) 4- ce~ KtlL = 2.4 4- c<T 50t 
3 . R l' 4 IIC = 0 , f = ce~ tKRC) 

5 . / = 5(cos / — cos 10/)/99 

7 . 7 0 is maximum when 5 = 0; thus C = \l(a) 2 L). 

9. R > R cr it = 2 Vt/C is Case I, etc. 

11. 0 

13 . c^ -20 * 4 c 2 (?" 10t 4* 16.5 sin 10/ + 5.5 cos 10/ 

15 . E ' = — <?“ 4t (7.605 cos §/ 4 1.95 sin |/), / = <?“ 0,lt (/l cos |/ 4 £ sin |/) 

— e“ 4t cos |r 

17 . E{ 0) = 600, /'( 0) = 600, / = 7?" 3 ‘(-100 cos 4/ 4 75 sin 4/) 4 100 cos / 
19 . (b) R = 2 ft, L = 1 H, C = 1/12 F, E = 4.4 sin 10/ V 

Problem Set 2 . 10 , page 101 

1. A cos a 4 B sin a — a cos a 4 (sin a) In |sin a| 

3 . C\X 4 c 2 a 2 — a cos A 

5 . (cos a )(c 1 4 sin x - In |sec a 4 tan a|) 4 (sin a)(c 2 — cos a) 

= (c 1 - In |sec a 4 tan a|) cos a 4 c 2 sin x 

7 . (c x 4 |a) sin a 4 (c 2 4 In |cos a|) cos a 

9 . (c 1 4 c 2 x)e x 4 a 2 4 4a 4 6 - e x (ln |a| 4 1 ) 

11 . c x cos 2a 4 c 2 sin 2a 4 |a cosh 2a 

13 . c x a 4 c 2 x 2 — a sin a 

15 . A cos a 4 B sin a 4 y pl 4 y p2 , y pl as y p in Example 1, y p2 = ^ sin 5a 
17 . u” 4w = 0 by substitution of y = ux~ 112 . y t = a~ 1/2 cos a, y 2 = a~ 1/2 
sin a, y p = — |a 1/2 cos a 4 |a _1/2 sin a from (2) with the ODE in standard 
form. 

Chapter 2 Review Questions and Problems, page 102 

9. Ci<? 4,r 4 c 2 e~ 2x - 1.1 cos 6a - 0.3 sin 6a 
11 . e“ 4r (A cos 3a 4 B sin 3a) - § cos 3a 4 § sin 3a 
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13. }’! = .V 3 , V 2 = A- -4 , /• = A -5 , W = -7a -2 , Vp = ~42 X ~ 3 ~ 7 A_3 = ~6 X ~ 3 
15. y x = e x , y 2 = xe x , W = e 2 *, y p = e x /(2x) 

17. yj = e x cos a, v 2 = e x sin a, W = e 2x , y p = —xe x cos a + e*(sin a) In |sin a*| 

19. y = 4e 2x + 2e~ lx 21. y = 9a -4 + 6a 6 

23. y = e -2a ' - 2e~ Sx + 18a 2 - 30a +19 25. y = |a 3 + 4a 2 - 5a -2 

27. y = — 16 cos 2/ + 12 sin 2 1 + 16(cos 0.5 / — sin 1. 50- 
Resonance for co/(2tt) = 2/(2? r) = Mtt 
29. (o = 3.1 is close to co 0 = \/khn — 3, y = 25(cos 3 / — cos 3.1/). 

31. R = 9 CL, L = 0.5 H, C = 0.025 F, E = 17 sin 6/ V, hence 0.5/" + 9/' + 40/ 

= 102 cos 6 /, / = — 8.16e -8< + 7.5e -10t + 0.66 cos 6 / + 1.62 sin 6 / 

33. E' = 220 -314 cos 314/, / = e~ 50c (A cos 150/ + B sin 150/) + 0.847001 sin 314/ 
- 1.985219 cos 314/ 

Problem Set 3.1, page 111 

7. Linearly independent 
11. a|v| = .v 2 if x > 0, linearly dependent 
13. Linearly independent 
17. Linearly independent 

Problem Set 3.2, page 115 

1. /" - 6y" + 1 1/ - 6y =0 3. y* - y = 0 

5. y w 4- 4y" = 0 7. Ci 4 c *2 cos .v 4 c 3 sin a 

9. c x e x 4- (c 2 + c 3 x)e~ x 11. Cie x + c 2 e a *^ 7)X 4 c 3 e a ~ vl)x 

13. e°- 25x 4- 4.3e~ 07x 4 12.1 cos O.i v — 0.6 sin OAx 
15. 2.4 4 ^^(cos L5 a* - 2 sin 1.5 a) 

17. y = cosh 5a* - cos 4a* 

19. V = C X X ~ 2 4 c 2 a* 4- c 3 a* 2 . W = 12/a* 2 

Problem Set 3.3, page 122 

1. (ci 4- c 2 x)e 2x 4- c 3 e~^ x — 0.04e~ Sx 4- a 2 4- a* 4- 1 
3. Cj cos 2 -* 4* c 2 sin ^a 4- a(c 3 cos \x 4- c 4 sin \x) — \e~ x sin \x 
5. CjA* 0,5 4- c 2 a 4- c 3 a 1,5 4- 0.1a 5,5 

7. Ci cos a* 4- 6 2 sin x 4 c 3 cos 3 a 4 c 4 sin 3 a 4 0.2 cosh 2 a 
9. y = (4 — x 2 )e 3x — 0.5 cos 3 a* 4 0.5 sin 3 a 
11. a -2 — a 2 4 5a 4 4 A(in a 4 1) 

13. 3 4 9e^ x cos 9 a - (1.6 - L5x)e x 

Chapter 3 Review Questions and Problems, page 122 

7. Cj 4 c 2 a 1/2 4 c 3 a“ 1/2 9. eye- 0 ™ 4 c 2 e 0 5x 4 c 3 e~ 15x 

11. 6*iA* 2 ( 2 In a - |) 4 c 2 a 2 4 c 3 a 4 c 4 4 £a 7 
13. c r e~ x 4 e xl \c 2 cos (|V3.v) 4 c 3 sin (|V3 a)) 4 Se vl2 
15. (cj 4 c 2 x)e x 4 c 3 e~ x 4 0.25aV 17. -0.5a” 1 4 1.5a -5 

19. cos 7 a* 4 e 3x — 0.02 cosh x 


9. Linearly dependent 

15. Linearly independent 
19. Linearly dependent 
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All 


Problem Set 4.1, page 135 

1. Yes 5. yi = 0.02(-y 1 + .y 2 ), yi = 0.02( vi - 2y 2 + y 3 ), y' 3 = 0.02 (y 2 - )’3> 

7. c x = 1, c 2 = —5 
9. 3 and 0 

11. y[ = y 2 , y 2 = 4 y x , y x = c x e~ zt + c 2 e zt = y, y 2 = y[ 

13. y[ = y 2 , y 2 = y 2 , eigenvalues 0, l, y x = Ci + c 2 e\ y 2 = yi = y 

15. y' x = v 2 , y 2 = 0.109375)’! + 0.7 5y 2 (divide by 64), y x = Cje -01251 + c 2 e°- 875t 

Problem Set 4.3, page 146 

1. )'j = eye* 61 + c 2 e 6t , y 2 = ~2c 1 e~ 6t + 2c 2 e 6t 
3. y x = c x e zt + c 2 , y 2 = c x e Zt - c 2 

5. y x = Cie 4lt + c 2 e~ 4lt = (c x + c 2 ) cos 4r + i(c x — c 2 ) sin 4r 

= A cos 4t + B sin 4 1, y 2 = fCje 4 ' 4 — ic 2 e~ 4lt 

= (/ci — ic 2 ) cos 4 t + i(ic x 4- ic 2 ) sin 4t = B cos 4/ — A sin 4/, A = c x + c 2 , 

B = /(c, - c 2 ) 

7. )>i = 2c 1 + c 2 e~ 6t , y 2 = -c x + c 3 e~ 6t , y 3 = -c x + 2(c 2 + c 3 )e~ 6t 
9. >>j = c x e 18t + 2 c 2 e~ 0M 4- 2c 3 « -1 ’ 84 , y 2 = 2c 1 e 18t 4- c 2 e~ 09t — 2 c z e~ 16t , 
y 3 = 2c 1 e 18t - 2c 2 <? _ °' 9£ + c 3 e~ 1Bt 
11. y x = 10 + 6<? 24 , y 2 = — 5 + 3<? 2 ' 

13. )’! = 2.4e -t - 2e 2 S \ y 2 = l.&T* + 2e 2 51 

15. yi = 2e 14 - 5t 4 10, y 2 = 5e 1454 - 4 

17 . y 2 = y'i + yi, y'2 = y'i + yi = -yi - y 2 = ~yi - (yi + yi)> y'l + 2 yi + 2 y x = 0, 

yi = e l (A cos t + B sin /), y 2 = yi + y x = e *(B cos t — A sin /). Note that 

r 2 ~ y 2 + y 2 2 = e~ zt (A z 4 B 2 ). 

19. I x = 4 Cl e- 200t + c 2 e~ 50t , I 2 = -c^ -2004 - 4c 2 <r 504 


Problem Set 4.4, page 150 

1. Saddle point, unstable, )’x = c x e~ 4t + c 2 e 4t , y 2 = —2 c x e~ 4t + 2c 2 e 44 
3. Unstable node, y x = c x e l 4 c 2 e st , y 2 = — c x e l + c 2 e 3t 
5. Stable and attractive node, y x = c x e~ 3t + c 2 e~ 5t , y 2 = c x e~ 3t — c 2 e~ 5t 
7. Center, stable, y x = A cos 4t + B sin 4 1, y 2 = -2B cos 4r 4 2 A sin 4 1 
9. Saddle point, unstable, y x = c x e 3t + c 2 e~ t , y 2 = c x e 3t — c 2 e~ t 
11. y’x = y = c x e kt + c 2 e~ kt , y 2 = y\ hyperbolas k^y 2 — y 2 = const 
13. y = e~ zt (A cos t + B sin t), stable and attractive spirals 
17. For instance, (a) -2, (b) —1, (c) — §, (d) 1, (e) 4. 


Problem Set 4.5, page 158 

1. (0, 0), yi = y 2 , yi = 3 y x , saddle point; (0, - 1), y x = y x , y 2 = -14 y 2 , yi = -y 2 , 
= 3yi, center 

3. (0, 0), yi = 4y 2 , y 2 = 2 y x , saddle point; (2, 0), y x = 2 + y x , y 2 = y 2 , yi = 4y 2 , 
y 2 = —2y x , center 

5. (0, 0), yi = —y x + y 2 , y 2 = —y t - y 2 , stable and attractive spiral point; (-2, 2), 

J’i = “2 + yx, y 2 = 2 + y 2 , yi = -y x - 3y 2 , yi = -yj - y 2 , saddle point 
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7. y'i = ,V2> v 2 ~ ~Vi(l ~ 4.VJ), (0, 0), y[ = y 2 , y' 2 = -y x , center; 

(i °)> Vi = 4 + Ji> )’2 = ,V2. y'i = y'2 = (“4 ~ Vi)(— 4.Vi), y ' 2 = 5b. saddle 

9. (577 ± 2/itt, 0) saddle points; (— ± 2 hit. 0) centers. 

Use -cos (±577 + y x ) = sin (±y x ) = ±v x . 

11. 3b = y 2 , y '2 = “Ji(2 + 3 ’ i )(2 - >’1). (0, 0), = -4v!, center; (-2, 0), y' 2 = 8^, 

saddle point; (2, 0), y 2 = 83^ saddle point 
13. y"/y' + 2 y'ly = 0, In y' + 2 In y = c, y 'y 2 = v 2 .Vi 2 = const 
15. y = A cos t + B sin t, radius VA 2 + B 2 

Problem Set 4.6, page 162 

3. 3’j = A cos 4 1 + B sin 4/ + f§, y 2 = B cos At — A sin At — ft 
5. .Vi = c x e 4t + c 2 e -3t + 4, 3» 2 = c x e 4t — 2.5 c 2 e~ 3t — 10 
7. 3 , 1 = 2c x e~ 9t + c 2 e~ 4t - 90 1 + 28, 3-2 = c x e~ 9t + c 2 e~ 4t - 126 1 + 14 
9. 3’x = CjC* + 4c 2 e 2t - 3/ - 4 — 2e~\ y 2 = -c x e l - 5c 2 e 2t + 5/4- 7.5 + e~ t 
11. 3>! = 3 cos 2t - sin 2t + t + 1 , y 2 = cos 2/ + 3 sin 2/ + 2t — £ 

13. . Vl = Ae~ l - Ae l + e 2 \ y 2 = -Ae~ e + t 

15. 3*1 = 7 - 2e 2t + e 3t - Ae~ 3t , y 2 = -e 2t + 3<? _3t 

17. /( + 2.5(/ a - I 2 ) = 845 sin /, 2.5(4 - 4) + 25/ 2 = 0, 

/i = (95 + 1 62.5r) <r 5t - 95 cos t + 312.5 sin t, 
h = (-30 - 162.5/)<T 5t + 30 cos t + 12.5 sin t 
19. 1[ + 2(7! - / 2 ) = 200, 2(7 2 - I x ) + 8/ 2 + 2 f I 2 dt = 0. 

7i = 2c 1 e A,t + 2 c 2 e^ + 100, 

4 = (1,1 + VoaT)^ 1 * + (1.1 - V0Al)c 2 e x * 1 , = -0.9 + VoaT, 

A 2 = -0.9 - VoaT 


Chapter 4 Review Questions and Problems, page 163 

11. 3'! = c x e 8t + c 2 e~ 8t , y 2 = 2c!e 8t — 2 c z e~ 8t . Saddle point 
13. y x = c x e‘ + c 2 e~ 6t , y 2 = c x e l — 6 c 2 e~ 6t . Saddle point 
15. 3’! = c x e 73t + c 2 e~ 3t , y z = —c x e 18t + Q.15c 2 e~ 3t . Saddle point 
17. 3*]. = c x e 51 + c 2 e\ y 2 = c x e 5t — c 2 e l . Unstable node 
19. 3’i = e~ t (A cos 2/ + B sin 2f), 3'2 = e~\B cos 2 1 — A sin 2/). Stable and 
attractive spiral point 

21. 3*! = c x e l + c 2 e~ % + e 2t + e~ 2t , y 2 = —c 2 e~ l — 1.5e~ 2t 
23. y x = c x e l + c 2 e~ 2t — 6e~ l — 5, 3*2= —c^ — 2 c 2 e~ 2t + 10e -t + 6 
25. 3*i = c x e 3t + c 2 e~ l + t 2 — 2/ + 2, v 2 = c x e 3t — c 2 e~ l — t 2 + 2/ — 2 
27. A saddle point at (0, 0) 

29. /1 = 4e -40t - e~ 10t , l 2 = ~e~ 40t + 4e _10t 

31. (n7r, 0) center for even n and saddle point for odd n 

33. Saddle points at (0, 0) and (§, |), centers at (0. 3) and ( 2 , 0) 

Problem Set 5.1, page 170 
1. « 0 (l + a + |a 2 +•••) = a 0 e x 

3. a 0 {l - 2x 2 + fc 4 - + • • •) + a x (x - §a 3 + -ftx 5 - + • • •) 

= a 0 cos 2a + 2 «i sin 2a 
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5. a 0 (l 4 \x) 

7. a 0 -1- a 0 x 4 (%a 0 4 |)* 2 4- • 

• • = a 0 e x + e x — x — 1 = ce x — 

9. a 0 4 a^x 4 \a^pr 4 • • • = a 0 — a x 4 a x e x 

11. s = 1 - 4 a 4- 8 a 2 - ^a 3 + 

f a 4 - ^a 5 , s(0.2) = 0.69900 

13. S = \ + \x - ^A 3 + 4 §qA 5 , 

5(1) = 0.73125 

15. S = 1 + X — A 2 — §A 3 + |a 4 

t + Ilv5 = 923 

-r 24 X , 768 

Problem Set 5.2, page 176 


i-M 

3. 2 (as function of t = 

5.0 

7.2 

9. 1 

11. 7 r 

13- 2 5 , ) x s ; R=\ 

s=3 5 ( S ~ 2 ) 

15 y (s ~ 4f x ° R- 

1 “ (5-3)! *’*" 


X - 1, C = + 1 


17. a 0 (l - J2* 4 - (k* 5 -••■) + “l(x + 2 xZ + 6 A ' 3 + 2$* 4 - Zi xS - ■ ■ ■) 

19 . a 0 + a x (x - f a 3 + lx 5 - &x 7 + | jx 9 - ^a 11 + -•••) 

21 . ao(J — 2 X * ~ m -* 4 rao -* 6 + ■ • fl i(A — e -^ 3 — 24 a 5 loos a 7 ‘ ‘ 

23. a 0 (l + a 2 4- a 3 + a 4 + a 5 + a 6 + • • •) + a x x 


Problem Set 5.3, page 180 

3. P 6 ( x) = ^(231 a 6 - 315a 4 + 105a 2 - 5), 

P 7 (a) = ^(429a 7 - 693a 5 + 315a 3 - 35a) 

7. Set a = az. y = c^jx/a) + c 2 Q n (x /a) 

15. Pi 1 = Vl - a 2 , Pa 1 = 3aVi - a 2 , P 2 2 = 3(1 - a 2 ), 
P 4 2 = (1 - a 2 )(105a 2 - 15)/2 


Problem Set 5.4, page 187 

A 2 A 4 

1o ' 1=1+ 3! + l \ + '" = 

, x * , 1 4 „ , 144 36 A- 

3 - y '~'-u + m x - + 

1 X x^ 

5. r(r - l) + 4r + 2 = 0, r x = -1, /- 2 = -2; jj = — - — + - + 


sinh a 1 a a 3 

— ■ ,2_ 7 + 2r + 4t' + 

144 36 

a 2 


cosh A 

A 

25a 4 


1024 


1 1 A" 

? 2 = — “ — + 


+ - 


_ J_ A 2 

a 2 2 + 24 720 

7. Euler-Cauchy equation with t = a + 3, yj = (a + 3) 5 , y 2 = y x In (a + 3) 
9. b 0 = 1, c 0 = 0, r 2 = 0,yi = e~ x , y 2 = e~ x In a 
11. 3'i = 1 /(a + 1), y 2 = 1 /a 

13. b 0 = | ,c 0 = 0, r 2 = 0 ,y x = a 1/2 (1 + 2a + 2a 2 + §a 3 + • • •), 
y 2 = 1 4- 2a 4- 2a 2 4- • • • 

15. y x = (a - 4) 7 , y 2 = (a — 4 ) -5 (Euler-Cauchy with / = a — 4 ) 


17.y 1 =A + A 3 -^A 4 + ^A 5 -^A 6 +---,y 2 = 1 +3a 2 -|a 3 + |a 4 -P 
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19. 3’ = CjF(|, §, x) + c 2 VxF( 1, 1, §; x) 

21. y = /4(1 - Ax + §x 2 ) + BV*F(-| §; x) 

23. y = c x F(2, -2, -§; t - 2) + c 2 (/ - 2 ) 3/2 F(l, -J, |; / - 2) 

Problem Set S.5, page 197 

1. Use (7b) in Sec. 5.2. 

3. 0.77958 (exact 0.76520), 0.19674 (0.22389), -0.27651 (-0.26005), 

-0.39788 (-0.39715), -0.17038 (-0.17760), 0.15680 (0.15065), 0.30086 
(0.30008), 0.16833 (0.17165) 

5. y = <?!./„( A-v) + c 2 J-,X Ax), v # 0, ±1, • • • 

7. y = c x J,xVx) + c 2 J_„(Vx), v # 0, ±1, • • • 

9. y = c' 1 .v7 l (2.v), J x , linearly dependent 
11. y = x - "[c 1 7 1 ,(x) + c 2 J_„(a)], y # 0, ±1, • • • 

13. y = c x J p (x 2 ) + c 2 J_„( x 3 ), v # 0, ±1, • • • 

15. y = c-yVxJ x { 2Vx), 7 x , linearly dependent 
17. y = a- 1 ' 4 ./^* 1 ' 4 ), J x , 7_! linearly dependent 
19. y = a 2/5 (c 1 7 8/5 (4a- 1/4 ) + c 2 J_ 8/5 (4a- 1/4 )) 

21. Use (24b) with ^ = 0, (24a) with v = 1 , (24d) with y = 2, respectively. 

23. y n (Aj) = 7 n (jc 2 ) = 0 implies a 1 “’V„(a 1 ) = x 2 ~ n J n (x 2 ) = 0 and [a -, V„(a)]' = 0 
somewhere between x x and x 2 by Rolle’s theorem. Now use (24b) to get 
7 „+i(a) = 0 there. Conversely, J n+ ,(.v 3 ) = 7 n+1 (A- 4 ) = 0, thus 
A' 3 n l 7, l+ |(A 3 ) = A 4 n+1 ./ n+1 (A 4 ) = 0 implies J n (x) = 0 in between by Rolle’s 
theorem and (24a) with v = n + 1. 

25. Integrate the formulas in (24). 

27. Use (24a) with v = 1 , partial integration, (24b) with v = 0, partial integration. 

33. CAS Experiment (b) a 0 = 1 , a - j = 2.5, x 2 = 20, approximately. It increases with n. 
(c) (14) is exact, (d) It oscillates, (e) Formula (24b) with v — 0 

Problem Set 5.6, page 202 

1. y = c x J 5 (x) + c 2 Y 5 (x ) 3. y = c x J 0 (Vx) 4- c 2 y 0 (Vx) 

5. y = c 1 J 2 (a 2 ) + c 2 Y 2 (x 2 ) 7. y = a-- 5 ( Ci 7 5 (a) + c 2 y 5 (A)) 

9. y = x 3 (c 1 7 3 (a 3 ) -I- c 2 y 3 (A- 3 )) 11. Set H a) = kH (2 \ use (10). 

13. Set a = is in (1), Sec. 5.5, to get the present ODE (12) in terms of s. Use (20), 

Sec. 5.5. 

Problem Set 5.7, page 209 

3. Set x = ct + k. S.x = cos 0, dx = -sin 6 dd, etc. 

7. A. m = (ititt/ 5) 2 , m = 1, 2, • • • ; y m = sin (imrx/5) 

9. X m — [(2m + 1)tt/2L] 2 . m = 0, 1, • • • ; y m (x) = sin [(2m + 1)ttxJ2L] 

11. A to = m 2 , m = 0, 1, • • • ; y 0 = 1, y m = cos nix, sin mx, m = 1, 2, • • • 

13. k = k m from tan k = -k. A m = k m 2 , m = 1, 2, • • • ; y,„ = sin k m x 

15. ^ = m 2 m = 1, 2, • • • ; y m = x sin (m In |x|) 

17. p = e 8x , q = 0, r = e 8x , A m = m 2 ; y m = e~ 4x sin mx, m = 1, 2, • • • 

19. A m - (imr) 2 , y m = x cos mirx, x sin mirx, m = 0, 1, • • • 
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Problem Set 5.8, page 216 

1. i -6P 4 (a) ~ 0.6P 0 (a) 3. §P 3 (a) - f P 2 (x) + fPxW - |P 0 W 

7. -OAllSP^x) - 0.6908P 3 (a) + I.844P 5 (a) - 0.8234P 7 (.y) + 0.1544P 9 (a) + • • • , 
m 0 = 9. Rounding seems to have considerable influence in Probs. 6-15. 

9. 0.3799P 2 (a) + 1.673P 4 (a) - 1.397P 6 (a) + 0.3968P 8 (a) + • • • . m 0 = 8 

11. 1.175P 0 (a) + 1.104P 1 (a) + 0.3575P 2 (a) + 0.0700P 3 (a) , m 0 = 3 or 4 

13. 0.7855P o (a) - 0.3550P 2 (a) + 0.0900P 4 (a) , w 0 = 4 

15. 0.1212P o (a) - 0.7955P 2 (a) + 0.9600P 4 (a) - 0.3360P 6 (a) + • • • , m 0 = 8 
17. (c) a m = (2/yi 2 (a 0j „ l ))(yi(a 0 ,,„)/a;o. ni ) = 2/(a 0m J i(ot 0m )) 

Chapter 5 Review Questions and Problems, page 217 

11. e 3x , e~ 3x , or cosh 3a, sinh 3a 13. e x , 1 + a 

15. e - * 2 , a*? - * 2 17. «-*, e~ x In a 

19. 1/(1 - a 2 ), a/(1 - a 2 ) or 1/(1 - a), 1/(1 + a) 

21. y = c,7 v -(6a) + c 2 J_ v -(6a) 23. 3> = c a y x (A 2 ) + c 2 Yi(x 2 ) 

25. v = \Tx[ Cl J w (\kx 2 ) + c 2 J_ m (\kx 2 )} 

27. A, n = (2 imr) 2 , y 0 = I , _y„, = cos 2mvx, sin 2mirx, in = 1, 2, • • • 

29. .v = CxJi(kx) + c 2 Yi(kx), c 2 = 0, y( 1 ) = CiJ x (k) = 0, k = k m = a X m (the positive 
zeros of J x ), y m = Ji(a litn x) 

31. 1.813P 0 (a) + 2.923P x (a) + 1.759P 2 (a) + 0.663P 3 (a) + 0.185P 4 (a) + • ■ • 

33. 0.693P o (a) - 0.285P 2 (a) + 0.144P 4 (a) - 0.091P 6 (a) + • • • 

35. 0.25P 0 (a) + O.SP^x) + 0.3125P 2 (a) - 0.0938P 4 (a) + 0.0508P 6 (a) + • • • 


Problem Set 6.1, page 226 


7. 


2 _ _ 2 _ 
a 3 a 2 
s cos $ — a) sin 6 


+ of 


? ~b$\ 
(I - e~ s ) 2 


13. - (1 - e ~ bs ) 
s 


19. 


3. 

s 

5. 


A - 2 

S 2 + 4-7T 2 

(•v 

- 2) 2 - 1 

9. 

e 3a 

11. 


1 

s + 2b 

l 2 

+ 4 

15. 

1 - (1 + 2a)<T 2s 

17. 

1 

- e ~ bs 

2s 2 


A 2 


be 


,-bs 


21. 2(Ji) = #0) ~ 2(f) = - - - (1 - e- 2 *) = e-^/s 

s s 


23. Set cl = p. Then 2(f(ct)) = f e~ sl f(ct) dt = f e~ <s/c)p f(p) dp/c = F(s/c)/c. 

-'A 


29. 4 cos nt — 3 sin TTt 

35. 2 - 2e~ 4t 

39. -4= sin V5/ - <T 5t 
V5 

«(a + A) + £ 

45> (a + k) 2 + 1 
51. 3tf -2t sin 5/ 


31. 1 — |/ 2 + |/ 4 33. sin — — 

Lj 

37. (e^ 1 - e _v15t )/(V3 + V5) 

41 3 ' 8 43 ^ 

(a - 2.4) 2 (a + a) 2 + co 2 

47. 3.5 /V 49. V2r 2 e~ tV2 


53. e 5 " ! sinh irt 
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Problem Set 6.2, page 232 

L _J 3 . ^ 5. 

( s — k) 2 s(s 2 + Ao) 2 ) s(s 2 — 4a 2 ) 

7TS 

7 ‘ (s 2 + Wf 

9. Use shifting. Use cos 2 a = % + § cos 2a; use cos 2 a + sin 2 a = 1 . 

Ans. ( 2s 2 + l)/[2s(s 2 + 1)] 

11. (s + |)K = - 1 + 17 • 2/(s 2 + 4 ), y = le~ m + 2 sin 2t - 8 cos 2 1 
13. (s 2 - })Y = 4.9, y = 4 cosh \t 

15. (.v 2 + 2s + 2)7 = .9 - 3 + 2 • 1, Y = (s + 1 - 2 )/[(s + I) 2 + 1], 
y = e _t (cos / — 2 sin /) 

17. (s 2 + 7s + 12)7 = 3.5 5 - 10 + 24.5 + 21/(5 - 3), y = \e 2t + |e _4t + |e _3t 
19. (5 + 1.5) 2 y = 5 + 31.5 + 3 + 54/5 4 + 64/5, 

y = 1/(5 + 1.5) + 1/(5 + I.5) 2 + 24/5 4 - 32/s 3 + 32 Is 2 , 

y = (1 + t)e~ l - 5t + At 3 - 16f 2 + 32/ 

21. / = t + 2, f = 4/(5 - 6), y = 4e 6t , v = 4<? 6<t_2) 

23. t = t + 1, (5 - 1)(5 + 4)F = 45 + 17 + 6/(5 - 2), y = 3e t_1 + e 2<t_l) 

25. (b) In the proof, integrate from 0 to a and then from a to cc and see what happens, 
(c) Find ££(/) and ££(/') by integration and substitute them into (1*). 

27. 2 - 2e~ 112 29. ( e M - 1) - j 31. cosh V5t - I 

33. g sinh 2/ - \l 


Problem Set 6.3, page 240 

3 . (1 - e 2 ~ 2s )/(s - 1 ) 


(? + 7 * l) e '* _ (? + 7 + l) 

, 2 + a- 2 ( “ e ~‘ “ ‘ 4,> 9 ' (l 7 + 7) 


11. g (e -3s + e- es ) 13. (e- 2 ** 2 ” - e ~ 4s+ ^) 

S + 7T S ~ 7T 

15. 0 if / < 4, / - 4 if t > 4 17. sin / if 27r < / < 87T, 0 elsewhere 

19. 0 if t < 2, (/ - 2) 4 /24 if / > 2 21. «(/ - 3) cosh (2/ - 6) 

23. e -t sin / 25. <? -2t cos 3/ + 9 cos 2/ + 8 sin 2/ 

27. sin 3/ + sin / if 0 < f < tt and | sin 3 1 if t > it 

29. / - sin t if 0 < / < 1, cos (/ — 1) + sin (/ - 1) - sin / if / > 1 

31. e % — sin / + u(t — 27r)(sin / — | sin 2 /) 

33. / = 1 + t, y" + 4y = 8(1 + /) 2 (1 - u(f - 4)), cos 2/ + 2/ 2 - 1 if / < 5, 

cos 2 1 + 49 cos (2/ - 10) + 10 sin (2/ - 10) if / > 5 
35. Rq' + q/C = 0, Q = %{q), q( 0) = CV 0 , i = q'(t), R(sQ - CV 0 ) + QIC = 0, 

q = CV 0 e~ tmC) 


„~2s 1 _ _-2s 


5 + 10 


1 - g-KXt-2) jf f > 2 


, / = 0 if / < 2 and 
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39. i = e~ 20t + 20f - 1 + u(t - 2)[-20 1 + 1 + 39<? _20(t_2) ] 

41. 0.1/' + 25 i = 490e -5t [l - «(/ - 1)], / = 20(<r 54 - e~ 250t ) + 20u(t - 1)[-<T 5£ 

250i 4*245 j 

43. / = (10 sin 10/ + 100 sin f)(u(f — tt) — w(f — 377)) 

45. i + 2/ + 2 /(t) dr = 1 - u(t - 2), / = (1 - e^Hs 2 + 2s + 2), 

J o 

/ = e -£ sin t — u{t — 2) <? -£+2 sin (/ — 2) 

47. / = 27 cos / + 6 sin t — e -l (27 cos 3/ + 1 1 sin 3/) 

+ «(/ — 2tt) [—27 cos t — 6 sin t + e~ tt ~ 2w> (27 cos 3/ + 1 1 sin 3/)] 


Problem Set 6.4, page 247 

1. y = 10 cos / if 0 < t < 2 tt and 10 cos / + sin t if t > 2ir 

3. y = 5.56* + 4.5e _t + 5(e t_1/2 - e~ t+1/2 )u(t - f ) - 50(<?* _1 - e~ t+1 )u{t - 1) 

5. y = 0.1 [«* + e -2£ (— cos t + 7 sin /)] 

+ 0.1 u(l - 10)[— e* + <T 2t+30 (cos (/ - 10) - 7 sin (/ - 10))] 

7. y = 1 + \e~ l sin 3/ + w(/ — 4) [ — I 4- e -t+4 (cos (3 1 — 12) + | sin (3 1 — 12))] 

- ^u(t - 5)e~ t+s sin (3 1 - 15) 

9. y = 5t — 2 — 50m(/ — Tr)e“ l+W sin 2 f. Straight line, sharply deformed between 7 r 
and about 8 

11. y = (0.4/ + 1.52)6* + 0.48<? -4 * + 1.6 «(/ - 2)[-e* + e - 4t+1 °] 


Problem Set 6.5, page 253 
1. / 

1 


3. e l - t - l 


7. XT («** - e" kt ) = T sinh ft/ 
2ft ft 


5. — sin w/ 

O) 

9. §(e 3£ - e -5£ ) 11. | cos 2f) = \ sin 2 i 

13. r — sin / 15. g(cosh 3/ — 1) 

19. Y = 3/((s 2 + 4)(s 2 + 9)), y = 0.3 sin 2/ - 0.2 sin 3/ 

21. (s 2 + 9)Y = 4 + 8(1 + 6 -,,s )/(s 2 + 1), y = sin / + sin 3/ if / < tt, § sin 3/ if / > tt 
23. 0 if 0 < / < 1, | f sin (2 (t - 1)) dr = -f cos (2/ - 2) + f if t > 1 

•'l 

25. y = 2e~ 2t - e~ 4t + (e" 2£+2 - e~ 4i+4 )u(t - 1) + (e~ 2t+4 - e~ 4t+8 )u(t - 2) 


27. y - 1 * y = 1, y = e* 

31. /(I + 1/s 2 ) = 1/s, y = cos / 


29. y - y * sin / = cos t, Y = 1/s, y = 1 
33. /(I + 2/(s - 1)) = (s - I)" 2 , y = sinh / 


Problem Set 6.6, page 257 

, 

(*-l) 2 
„ 2s + 4 
(s 2 + 4s + 5) 2 
2w(3s 2 — <w 2 ) 

(s 2 + <w 2 ) 3 


, 2 (os 

3 ‘ (s 2 + <u 2 ) 2 
„ 24s 2 + 128 
’ (s 2 - I6) 3 
2s cos ft + (s 2 - 1) sin ft 


(s 2 + l) 2 
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13. 6 te 1 15. te 2t sin t 

17. t 2 e kt 19. In s - In (s - 1); (e‘ - l)/t 

Problem Set 6.7, page 262 

1. .Vi = ~e~ l sin t, y 2 = e_t cos t 3. = 2e _4t — 4e 2t , y 2 = e~ 4t — 8e 2t 

5. 3 ! x = 2e~ ( + 4e -2t + \t — y 2 = —3e~ l — 4e~ 2t — 2 t + 4 

7. _v t = e~\2 cos 2f + 6 sin 2 t) + t 2 , y 2 = I0e~ l sin 2/ — t 2 

9. v x = 4 cos 5f + 6 sin 5/ — 2 cos r — 25 sin t, y 2 = 2 cos 5 t — 10 sin 5l + 20 sin t 

11 . .>>! = —cos t + sin t + 1 + 11(1 — 1)[— 1 + cos (r — 1) — sin (/ - I)] 

y 2 = cos / + sin / — 1 + u(t — 1)[1 — cos (t — 1) — sin (t — 1)] 

13. y x = 2 u(t - 2)(e 4t - e t+6 ), y 2 = e 2t + u(t - 2)(e 4t - 3e zt+4 + 2e t+6 ) 

15. 31 = —e~ 2t + e c + §«(/ — l)(—e~ 2t+2 + e l ), y 2 = -e~ 2t + 4e t 
+ §«(/ - 1)(— e -2t+3 + e l ) 

17. 3>! = 3 sin 2 1 + 8e -3t , y 2 = — 3 sin 2f + 5e _3t 
19. 3»i = c‘ - e~\ y 2 = e\ y 3 = e~ x 

25. 4 ii + 8(/ 1 — i 2 ) + 2 i'i = 390 cos t, 8 i 2 + 8(/ 2 — i\) -I- 4/ 2 = 0, i\ = — 26e -2t 

— I6e~ 8t + 42 cos / + 15 sin t, i 2 = -26e~ zt + %e~ 8t + 18 cos t + 12 sin t 


Chapter 6 Review Questions and Problems, page 267 



17 10 21 

( s - 1)(j 2 + 4) s 4 - 1 " (a- - a)(s - b ) 


23. 10 cos rV 2 25. 3e~ 2 ' sin 4/ 27. «(/ - 2)(5 + 4(r - 2)) 

29. te~ 2t sin / 31. (r 2 - !)«(/ - 1) 33. (cot - sin cot) 

co 

35. 20 sin / + u(t — 1)[1 — cos (/ — 1)] 

37. 10 cos 2 1 — \ sin 2 1 + 4 «(/ — 5) sin (2/ — 10) 39. e~\l cos 3f + 2 sin 3r) 

41. e~ x + u(t - tr)[\.2 cos t - 3.6 sin t + 2e~ t+ir - 0.8e 2t " 2 ^ 

43. «(/ - 1)(/ - \)e 2t ~ 2 + 4 u(r - 2)(2 - t)e 2l ~ 4 

45. = e x + £e~ x — § cos t — \ sin t, y 2 = —e t + £ e~ x + § cos t + § sin t 

47. 3’x = \e~ x sin 2 t, y 2 = e“‘(cos 2t — \ sin 2t ) 

49. 3'j = e 2t , y 2 = e 2t + e l 

51. / = (1 — « _2 s )/[s(j + 10)], / = 0.1(1 - e~ 10t ) + 0.1 «(r - 2)£— 1 + <r 10t+2 °] 
53. 1 = e~ 2t (16 cos 4/ — 42 sin 4/) — 76 cos 20r + 16 sin 20f 
55. i[ + 1 0(/ x - i 2 ) = 100 t 2 , 30/2 + 10(/ 2 - /() + 100/ 2 = 0, 
h = (I + 4t)e~ 5t + 10 1 2 - % i 2 = (f + 2t)e~ 5t + 2t - f 


Problem Set 7.1, page 277 
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'-6 r 


" 6 - 1 " 


3. Undef., 

6 -3 

* 

-6 3 

, undef. 


.9 0_ 


1 

o 

os 

1 

i 




"-48 

-2" 


' 36 

0 

48' 


'—0.3 

-5.0 

-3.4" 

5. 

38 

-44 


-12 

24 

24 

, same, 

-4.9 

1.8 

3.8 


. 67 

-15. 


. 72 

60 

— 48_ 


_— 3.6 

3.5 

0.4_ 



" 66" 


‘ 0 ' 


" 6.5" 

7. 

0 

» 

3.2 

, same, 

-0.8 


_-33_ 


_— 4.2 _ 


.-22. 


9. — 5* 2 = — 3 

-5*j + 2*2 = 4 

—3*i + 4*2 = 0 


Problem Set 7.2, page 286 



' 24" 


' 2" 


' 54 

10 

-46' 

1 . 

49 

, undef., 

38 

• 

74 

19 

-29 


.-43. 


.-22. 


.-74 

-5 

51. 



‘ 54 

10 

-46" 


" 134 

-50 

-18" 


' 44 

64 

-72" 

3. 

74 

19 

-29 

? 

94 

-29 

-1 


64 

110 

-114 


.-74 

-5 

51. 


.-134 

63 

19. 


.—12 

-114 

126. 


' 236 -92 -12" 
-92 38 6 

.-12 6 6 . 


5. [20 -3 


-7], [-62 


34 2], 


"565“ 

525 

_790_ 


same 



"15 

0 

40" 


'-310 

170 

10" 


7. 

3 

0 

8 

.31. 

-62 

34 

2 

, same 


. 6 

0 

16. 


.-124 

68 

4j 



I" 337 


8 - 

160“ 

1 

' 257 

68 

-188“ 

9. 

252 


49 • 

-68 


same, 

232 

97 

-96 


.-308 


52 

233. 

1 

.-248 

-16 

265. 
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' 324 

32 

-320“ 


‘ 216 

-104 

-104" 


" 7060 

960 

-5120“ 

11. 

244 

38 

-322 

> 

280 

-132 

-68 

> 

7548 

1246 

-5434 


.-244 

-10 

366. 


.-280 

140 

76. 


.-8140 

-1090 

6150. 


“ 4324 

1520 

-4816“ 

3636 

1242 

-4518 

.-3700 

-1046 

5002. 


13. 83, 166, 593, 0 


19. (d) AB = (AB) t = B t A t = BA; etc. (e) Arts. If AB = -BA. 
21. Triangular are U! + U 2 , U 1 U 2 , U/, L x -I- L 2 , L X L 2 , L x 2 . 

23. [0.8 1.2] t , [0.76 1.24] T , [0.752 I.248] T 


27. p = [110 45 801 T , v = [92000 

86300] T 

Problem Set 7.3, page 295 

1. X = 2.5, y = —4.2 

3. x = < 

5. x = 0, y = —2, z = 9 

7. x = - 

9. jc = 3y + 2, y arb., z = -y + 6 

11. v = 

13. w = 1 , y = 2z — a*, a*, z arb. 

15. w = 


17. Ii — (Ri + /? 2 )E 0 /(^i^2)^ ^2 — Eq/Ri, / 3 — E 0 /R 2 [Amps] 

19. I x - I 2 - / 3 = 0, (3 + 2 + 5)1 1 + 10/ 2 = 95 + 35, 10/ 2 - 5/ 3 = 35, f x = 8, 

/ 2 = 5, / 3 = 3 Amps 

21. x x -r x 4 = 500, x x -f a * 2 = 800, a 2 + a 3 = 1 100, x 3 + a 4 = 800, x x = 500 — a* 4 , 

a 2 = 300 + x 4 , x 3 = 800 - a* 4 , x 4 arbitrary 


Problem Set 7.4, page 301 
1.1, [1 —2]; [1 0 -3] T 


3. 3, [1 

4 

0 7], [0 

-2 1 31, [0 0 

5 

105]; [-2 4 5] t , [0 1 5] T , 

[0 0 

1] 

T 





5. 2, [3 

0 

5], [0 3 

41; [3 0 5] t , [0 

3 

4] T 


7. 2, [8 

0 

41, [0 2 

0J; [8 0 4 0] T , 

[0 

2 0 4] T 

9.3, [1 

0 

3 0], [0 

5 8 -37], [0 0 

-74 

296]; same transposed 

11. 4, [1 

0 

0 0], [0 

1 0 0], [0 0 

I 

0], [0 

0 0 1]; same transposed 

13. No 



15. No 



17. Yes 

19. Yes 



21. (c) 1 



27.2, [1 -1 0], [0 0 

29. No 
35. 1, [5 

5 

2 

1 S 1] 

31. 1, [-4 \ 

1] 


33. No 


Problem Set 7.7, page 314 

5. 107 7. cos (a + /3) 


9. -66.88 

11.0 

13. u 3 + v 3 + w 3 — 3 uvw 

15.4 

19. x = —1.2, y = 0.8, s = 3.1 

21. 1 

23. 3 
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Problem Set 7.8, page 322 

‘ 1.80 -2.32“ 

1 . • 
0.25 0.60_ 


cos 26 —sin 26' 
.sin 26 cos 2 6. 


7.-2 1 0 

.3-4 1. 


9. A -1 = A 


4 -1 -5 


5. 5 10 

_0 1 3 


11. No inverse 


15. (A z )" 1 = (A =15 1-5 

Jl 5 4 9j 

19. AA -1 = I, (AA -1 ) -1 = (A - 1 ) - 1 A -1 = I. Multiply by A from the right. 
21. det A — — 1 . C 12 — C 2 j = C 33 = — 1 , the other Cj k are zero. 

23. det A = 1. C u = 1, C x2 = —2, C 22 = 1, C 13 = 3, C 2 3 = —4, C 33 = 1 


Problem Set 7.9, page 329 

1. Yes, 2, [3 5 0] T , [2 0 -5] T 3. No 

5. Yes, 2, [0 0 0 1 0] T , [0 0 0 0 if 


" 0 1 “ 

7. Yes, 1, 

.-1 0 . 


11. Yes, 2, xe~ x , e~ x 

13. [1 Of, [0 if; [1 If, [-1 If; [1 Of , [0 -if 
15. a'i = — 0.6yi 4- 0.4y 2 17. X\ = 2y\ + y 2 

x 2 = -0.8.V! + 0.2y 2 x 2 = 5y x + 3y 2 

19. x 1 = 5y x + 3y 2 - 3 .v 3 
x 2 = 3y x + 2.y 2 - 2y 3 
A3 = 2.y x - y 2 + 2v 3 

21. V56 23. 16 V5 

25. 2 29. 4vi - 3v 2 = 0, v = ± [§ |] T 

Chapter 7 Review Questions and Problems, page 330 


11. X = 4, y = 7 

15. x = \,y = z = f 

19. x = 2 z, y = 4, z arbitrary 

23. 638, 0, 0 

“12 0 6 

27. 14, 14, 28 0 14 


13. x = y + 6 , z = y, y arbitrary 
17. a = 7, y = -3 
21.0 

“ 8.0 -3.6 1.2" 

25. -3.6 2.6 2.4 

. 1.2 2.4 9.0. 


29. [-20 9 -3], 
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31. 2, 2 33. 2. 2 


35. 2, 2 



37 

51 

■ 4 
5 

a 



"-5 

10 

5" 


‘ 72 

-72 

132' 

39. 45 

11 

5 

-2 

41 — 

12 

31 

-32 

59 


. 23 

-10 

4. 


19 

20 

-35. 


43. = 33 A, / 2 = 1 1 A, / 3 = 22 A 

45. = 12 A, / 2 = 18 A, 7 3 = 6 A 


Problem Set 8.1, page 338 

1. -2, [1 0] T ; 0.4, [0 1] T 

3. 4, 2x x + (-4 - 4)x 2 = 0, say, jtj. = 4, x 2 = 1; —4, [0 1] T 

5. -4, [2 9] t ; 3, [1 1] T 7. 0.8 + 0.6/, [1 -if; 0.8 - 0.6/, [1 /f 

9.5, [1 2] T ;0, [-2 1] T 11. 4, [1 0 0] T ; 0, [0 1 0] T ; -1, [0 0 if 

13. -(A 3 - 18A 2 + 99A - L62)/(A - 3) = -(A 2 - 15A + 54); 3, [2 -2 1] T ; 

6, [1 2 2] t ; 9, [2 1 -2] T 

15. 1, [-3 2 10] T ; 4, [0 1 2] T ; 2, [0 0 1] T 

17. -(A 3 - 7A 2 - 5A + 75)/(A + 3) = -(A 2 - 10A + 25); -3, [1 2 -if; 

5, [3 0 1] T , [-2 1 Of 

19. -(A - 9) 3 ; 9, [2 -2 if; defect 2 

21. A(A 3 — 8A 2 — 16A + 128)/(A - 4) = A(A 2 — 4A — 32); 4, [— 1 3 1 if ; 

-4, [1 1 -1 —If; 0, [1 1 1 If; 8, [1 -3 1 -3f 

23.2, [8 8 -16 If; 1, [0 7 0 4f ; 3, [0 0 9 2f , -6, [0 0 0 If 

25. (A + 1) 2 (A 2 + 2A- 15); -1, [1 0 0 Of , [0 1 0 Of; 

-5, [-3 -3 1 If, 3, [3 -3 1 -If 

29. Use that real entries imply real coefficients of the characteristic polynomial. 


Problem Set 8.2, page 343 


1 . 


"-1 O' 


"1" 


"0" 


;-i. 


; l. 


. 0 1 . 


. 0 . 


_ 1 _ 


; any point ( x , 0) on the A-axis is mapped onto 


( — a, 0), so that [1 Of is an eigenvector corresponding to A = — 1. 


3. (a, y) maps onto (a, 0). 


'1 

O' 


T 


"0" 



; 1. 


; 0, 


.0 

0. 


.0. 


_l_ 


. A point on the A-axis maps 


onto itself, a point on the y-axis maps onto the origin. 

5. (a, y) maps onto (5a, 5 >•). 2X2 diagonal matrix with entries 5. 

7. -2, [J - If -45°; 8, [1 if, 45° 

9.2, [3 -If, -18.4°; 7, [1 3f, 71.6° 

11 . 1 , [-1/V6 l], 112.2°; 8, [l 1/V6], 22.2° 

13.1,11 If, 45°; -5, [I -If, -45° 

15. c[15 24 50f , c > 0 

17. x = (I - A) -I y = [0.73 0.59 1.04f (rounded) 

19. [1 I If 21. 1.8 


23. 2.1 
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Problem Set 8.3, page 348 

3. No S. A -1 = (— A T ) -1 = — (A -1 ) t 

7. No since det A = det (A T ) = det (-A) = (- 1) 3 det A = — det A = 0. 
9. Orthogonal, 0.96 ± 0.28/ 11. Neither, 2, 2, defect 1 

13. Symmetric, 9, 18, 18 15. Orthogonal, 1, /, — / 

17. Symmetric, a + 2b, a — b, a - b 


Problem Set 8.4, page 35S 


1. [1 2f, [2 —if; X = 

"1 

2" 

,D = 

1 

O 

1 


.2 

-1. 


L0 2J 


"4 

0" 



3. [1 -If, [1 if, D = 

.0 

6. 




5. [2 -If, [2 If, diag (-2, 4) 

7. [1 0 Of, [1 -2 If, [0 1 Of , diag (1, 2, 3) 

9. [0 3 2f,[5 3 Of, [1 0 2f, diag (45, 9, -27) 




'-95 

18 

-144' 


' 4' 


" 6' 


' 3" 

17. 

24 

-2 

36 

; 4, 

-2 

; -2, 

-1 

; 1, 

0 


. 66 

-12 

100. 


.-3. 


4_ 


2_ 



'-2' 


' 0" 


' 0" 

X = 

-4 


-2 


0 


.-6. 


.-4. 


2_ 


19. C = 


21. C = 


23. C = 


25. C = 


27. C = 


" 1 121 [ 0.8 0 . 6 " 

, lOyf - 15y 2 2 = 5, x = y, hyperbola 

L 12 -6j L0.6 -0.8 J 

:;]■ 

]■ 


[ 2/V5 

l/Vs' 

, 5yf - 5y 2 2 = 0, x = 

J -1/V5 

2/V5_ 


4 V3 
V3 2 
1 -6 
-6 
12 

L16 


yf + 5 y 2 2 = 10, x = 
7)'i 2 - 5j’ 2 2 = 35, x = 

4v 2 2 = 1 12, x = 


:]■ 

161 

. 28 yf - 

12 ] 


1/2 V3/2 

-V3/2 1/2 

1/V2 1/V2' 

|_— 1/V2 1/V2_ 

“l/V2 1/V2 

J/V2 -I/V2J 


y, straight lines 
y, ellipse 
y, hyperbola 


y, hyperbola 
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Problem Set 8.5, page 361 

3. (ABC) t = C t B t A t = C _1 (~B)A 

5. Hermitian. 3 + V2. [— t I — V2] T ; 3 — V2, [-/' 1 + V2 ] T 
7. Hermitian, unitary, 1 , [l / — /V2] T ; - 1 , [l i + /V2] T 
9. Skew-Hermitian, 5/, [1 0 0] T , [0 1 1] T ; -5/, [0 I -1] T 

11. Skew-Hermitian, unitary, /, [1 0 1J T , [0 I 01 T ; — /, [1 0 — 1| T 

13. Skew-Hermitian, —66/ 15. Hermitian, 10 

Chapter 8 Review Questions and Problems, page 362 




Problem Set 9.1, page 370 


1. 2, -4, 0; V20; [1/V5, -2/V5, 0] 
5. -8, -6, 0; 10; [-0.8, -0.6, 0] 

9. (i §, |); V37/8 

13. [4, -2, 0], [-2, 1,0], [-1,|, 0] 
17. [28, -14,-14] 

23. (5.5, 5.5, 0), (§, i f ) 

27. [-8, -2, 41; V84 


3. -1, 0, 5; V26; [-1/V26, 0. 5A/26] 
7. (7. 5, 0); VTO 
11. (0, 1,|); V37/2 
15. [10, -5, -15] 

19. [-2, 1, 8], [6, -3, -24] 

25. [0, 0, 9]; 9 
29. v = [0, 0, -9] 


31. [-9, 0, 0], [0, -2, 0], [0, 0, -11]. Yes. 33. |p + q + u| 


35. 


25 25 "| f 20 5 

V2 ’ V2 J L V2 ’ V2 J L V2 ‘ V2_ 


^ 6. Nothing 

37. |w|/(2 sin a) 


Problem Set 9.2, page 376 

1. 4 3. V24l 

5. [12, -8, 4], [-18, -9, -36] 7. 17 

9. -4,4 11.-24 15. Use (1) and |cos y| ^ 1. 

17. |a + b| 2 + |a - b| 2 = a*a + 2a*b -I- b*b + (a*a - 2a*b + b»b) = 2|a| 2 + 2|b| 2 
^9- 0 21. 1 5 23. Orthogonality. Yes 
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25. 2, 2, 0,-2 27.79.11° 

31. 54.74° 33. 54.79°, 79.1 1°, 46. 10° 

37.3 39.1.4 

41. If |a| = |b| or if a and b are orthogonal 


29. 82.45° 

35. 63.43°, 116.57° 


Problem Set 9.3, page 383 

1. [0, 0, - 10], [0, 0, 10] 3. [-4, -8, 26] 5. [0, 0, -60] 

7. -20, -20 9. 240 11. [19, -21, 24], VI378 

13. [10, -5, -1] 15. 2 17. 30, -30 

19. -20, -20 25. [-2, 2, 0] X [4, 4, 0] = -16k, 16 

27. [1, - 1 , 2] X [1, 2, 3] = [-7, -1, 3], V59 
29. [0, 10, 01 x [4, 3, 0] = [0, 0, -40], speed 40 
31. | [7, 0, 0] x [1, 1, 0]| = 7 33. |V3 

35. [18, 14, 26]; 9x + 7y + I3z = c, 9 -4 + 7 • 8 + 13 -0 = 92 = c 
37. 16 39. c = 2.5 


Problem Set 9.4, page 389 


1. Hyperbolas 3. Hyperbolas 5. Circles 

7. Ellipses; 288, 100, 409; elliptic ring between the ellipses 


,2 


(2/3 f (1/2 f 


= 1 


and 


(4/3) z 


+ y 2 = 1 


9. Ellipsoids 11. Cones 13. Planes 

23. [8a-, 0, yz], [0, 0, xz], [0, 18s, av]; [0, z, y], [z> 0, a], [y, x, 0] 


Problem Set 9.5, page 398 

1. [4 + 3 cos t, 6 + 3 sin f] 3. [2 - r, 0, 4 + t\ 

5. [3, —2 + 3 cos t, 3 sin t] 7. [ a + 3/, b — 2 1, c + 5r] 

9. [V2 cos /, sin t, sin fl 11. Helix on (x — 2) 2 + (y — 6) 2 = r 2 

13. Circle (x - if + (y + 2) 2 = 1, z = 5 15. x 4 + y 4 = 1 

17. Hyperbola xy = 1 

23. r' = [—5 sin l, 5 cos t, 0], u = [—sin t, cos /, 0], q = [4 — 3w, 3 + 4 w, 0] 
25. r' = [si nh t, cosh f], u = (cosh 2t)~ m [sinh t, cosh /], q = [f + Aw, § + 5w] 
27. Vr'*r' = cosh t, ( = sinh 1 = 1.175 
29. Start from r(r) = [/, /(/)]. 

33. v = r' = H, 2 1, 0], |v| = Vl + 4/ 2 , a = [0, 2, 0] 

35. v(0) = la Ri, a(0) = -ofRj 

37. 1 year = 365 • 86400 sec, R = 30 • 365 • 86400/2 tt = 151 • 1 0 6 [km]. |a| = ofR 
= |v| 2 //? = 5.98 • 10" 6 [km/sec 2 ] __ 

39. R = 3960 + 80 mi = 2.133 • 10 7 ft, g = |a| = ofR = |v| 2 //?, |v| = V^R = 

V6.61 • 10 8 = 25700 [ft/sec] = 17500 [mph] 

43. r (/) = [/, y(r), 0], r' = [1, y', 0], r' • r' = 1 + y' 2 , r" = [0, y", 0], etc. 
47. 3/(1 + 9t z + 9t 4 ) 



A26 


App. 2 Answers to Odd-Numbered Problems 


Problem Set 9.6, page 403 

1. w' = 2V2(sinh 4/)/(cosh 4 1) 112 

3. v/ = (cosh f) sinh I_1 ((cosh 2 t) In (cosh t) + sinh 2 t) 

5. w' = 3(2/ 4 + t 8 ) 2 (8t 3 + 8/ 7 ) 7. e Ml sin 2 2v, \e* u sin Av 

9. —2{u z + v 2 )~\ — 2(« 2 + v 2 )~ z v 


Problem Set 9.7, page 409 


1. [2a, 2y] 

3. [1/y, -v/y 2 ] 

5. b f + 2, a 

7. [6, 4, 4] 

9. [-1.25, 0] 

1 

O 

i-H 

13. [-4, 2] 

15. [-18, 24] 

17. [48, -36] 

19. [6, 4] 

21. [-6, -12] 


23. [-0.0015, 0, - 

-0.0020] 

27. [a, b, c] 

29. [8, 6, 0] 

31. [108, 108, 108] 

33. V2/3 

35. 7/3 

37. 2<? 2 /Vl3 


39. |a 2 + §y 2 - 2z 2 

41. a 4 + y 3 - 3z 2 



Problem Set 9.8, page 413 

1. 3(.v + v) 2 3. 2(a + xz + z) 5. (y + x + 1) cos xy 

7. 9a 2 v 2 z 2 

9. [u l5 u 2 , t> 3 ] = r' = [a-', y\ z \ = [y, 0, 0], z' = 0, z = c 3 , y’ = 0, >• = c 2 , 
x — y = c 2 , x = c 2 t + ci. Hence as t increases from 0 to 1, this “shear flow” 
transforms the cube into a parallelepiped of volume 1 . 

11. div (w x r) = 0 because u 1 , v 2 , v 3 do not depend on x, y, z, respectively. 

13. (b) (fux) x + (fv 2 ) y + (fv 3 ) z = /[(Ui) x + (v 2 ) y + (u 3 )J + /*£>! + f y v 2 + f z v 3 , etc. 
(c) Use (b) with v = Vg. 

15. 4(a + y)/(y - a) 3 17. 0 19. e^iyh 2 + x 2 z 2 + a 2 /) 

Problem Set 9.9, page 416 

1. [0, 0, 4a - 1] 3. [0, 0, 2e x sin y] 5. [0, 0, -4v/(a 2 + y 2 )] 

9. curl v = [— 2z, 0, 0], incompressible, v = r' = [a ; , y , z] = [0, z 2 , 0], 
x — Cj, z — c 3 , y' = z 2 = c 2 , y = c 3 t + c 2 

11. curl v = [0, 0, —2], incompressible, x = y, y = —a, z' — 0, z = c 3 , 

y dy + a dx = 0, a 2 + y 2 = c 

13. Irrotational, div v = 1, compressible, r = [cie 1 , c 2 e~ l , c 3 e l ] 

17. 0, 0, by - zx, yz ~ xy, zx - yz] 

19. 0, 0, 0, — 2yz 2 - 2za 2 - 2xy 2 


Chapter 9 Review Questions and Problems, page 416 


11. [-1, 9, 24] 

15.r0, 0, -740], [0, 0, -740] 

19. -495, -495 

23. If u x v = 0. Always 

27. 3.4 


13.0, [-43, 54, 3], [43, -54, -3] 
17. [-24, 3, -398], [114, 95, -76] 
21. 90°, 95.4° 

25. [w lt v 2 , -3] 

29. If y > %TT, §7T 
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33. 45/6 35. No 

37. 0, 2y 2 + (z + xf 39. [-1, 1, -1], [-2 z, -2x, -2 y] 

41. 0, 2x 2 + 4y 2 + 2z 2 + 4xz 43. 488/V3323 45. 0 

Problem Set 10.1, page 425 

1. F(r(r)) = [125r 6 , r 3 , 0], 16448/7 = 2350 3. 0 + 160 

5. F(r(0) = [cosh / sinh 2 t, cosh 2 / sinh /], 93.09 
7. F(r(r)) = [r, cos r, sin r], 677 
9. F(r(/)) = [cosh § 1, sinh \t, e m ], 0.6857 
11. F(r(0) = [e\ e t2 , e% e 2 + 2e 4 - 3 

15. 17/3 17. [36 7T, |(87t) 3 , 36tt] 

Problem Set 10.2, page 432 

1. sin xy, 1 3. — 0 

7. x 2 y + cosh z, 392 11. sinh ac 

15. ce a — ae b 17. \a 2 bc 2 


Problem Set 10.3, page 438 



15. x = \b, y = !/? 17. / x = bh z t\2, I y = b z h!4 

19. l x = (a + b)h 3 /24, l y = /x(a 4 - 6 4 )/(48(a - b)) 


Problem Set 10.4, page 444 

1. 2x 3 y - 2xy 3 , 81 - 36 = 45 3. 3* 2 + 3/, 1875tt/2 = 2945 

5. e°~ v — e x+y , — §e 3 + \e 2 + e~ x — | 7. 2x — 2y, —56/15 

9. 0 (why?) 11. Integrand 4. Ans. 40ir 

13. y from 0 to \x, x from 0 to 2. Ans. cosh 2 — | sinh 2 

15. y from 1 to 5 — x 2 . Ans. 56 19. 4<? 4 — 4 

Problem Set 10.5, page 448 

1. Straight lines, k 

3. x 2 /a 2 + y 2 lb 2 = 1 , ellipses, straight lines, [-b cos v, a sin v, 0] 

5. z = (c/o)Vx 2 + y 2 , circles, straight lines, [—acu cos u, —acu sin v, a 2 u] 

7. x 2 /9 + y 2 /\ 6 = z, ellipses, parabolas, [— 8« 2 cos v, —6m 2 sin v, 12m] 

9. x 2 /4 + y 2 /9 + z 2 /16 = l, ellipses, [12 cos 2 v cos n, 8 cos 2 v sin u, 6 cos v sin u] 
13. [10m, 10m, 1.6 - 4m + 2v], [40, -20, 100] 

15. [—2 + cos v cos m, cos v sin u, 2 + sin u], 

[cos 2 v cos m, cos 2 v sin n, cos v sin u] 


5. e xz + y, -2 

13. No 
19. No 
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17. [«, v, 3u 2 ], [0, — 6i>, 1] 

19. [u cos v, 3 u sin v, 3m], [— 9h cos v, —3m sin v , 3m] 

21. Because r u and r„ are tangent to the coordinate curves v = const and u = const, 
respectively. 

23. [m, v, m 2 + v z ], N = [-2m, —2v, 1] 

Problem Set 10.6, page 456 


1.-64 3.-18 5. — 12877 

7. 2 tt 9. %a 3 11. 17/i/4 

15. 140V6/3 17. 128tt\/2/3 = 189.6 

19. ^tt 2 (37 3/2 - 5 3/2 ) = 22.00 25. 2 77/1 

27. tt / i 4 /\/2 29. Trh + 2tt/i 3 /3 

Problem Set 10.7, page 463 

1. Sa 3 b 3 c 3 /21 3. 6 5. 42§tt 

7. 234-77 9. 2m 5 /3 11. /m 4 tt/2 

13. -tj/i 5 /10 17. 108-tt 19. 216t7 

21. 0 23. 8 25. 38477 


Problem Set 10.8, page 468 

1. Integrals 4 • 1 • 1 (x = 1), 4 • 1 • I (y = 1), -8 • I • 1 (z = I), 0 (x = y = z = 0) 
3. 2 (volume integral of 6y 2 ), 2 (surface integral over jc = 1). Others 0 
5. Volume integral of 6y 2 — 6x 2 is 0. 2 (x = 1), —2 (y = 1), others 0. 

7. F = [x, z], div F = 3, In (2), Sec. 10.7, F»n = |F||n| cos (f> 

= Vx 2 + y 2 + z 2 cos (f> = r cos <jf>. 

9. F = [x, 0, 0], div F = 1, use (2*), Sec. 10.7, etc. 

Problem Set 10.9, page 473 

1. [0, 8z, 1 6] • [0, -1, 1], ±12 

3. [— e* , -e*, e y V[-\, -1, 1], ±(e 2 - 1) 

5. S: [m, v, y 2 ], (curl F)*N = -4ve 2v \ ±(4 - 4e z ) 

7. (curl F)*n = 3/2, ±3a 2 /2 9. The sides contribute a, 3c 2 12, -a, 0. 

11. curl F = [0, 0, 6], 24tt 13. (curl F)*n = 2x - 2y, 1/3 

15. —tt/4 17. (curl F)»N = 7r(cos 7 rx + sin Try ), 2 

19. F»r = [—sin 6, cos 0] * [ — sin 8, cos 8] = 1, 2-77, 0 


Chapter 10 Review Questions and Problems, page 473 


11. Exact, -542/3 
17. By Stokes, ±18-77 
23. 0, 4<? / 3 77 
29. By Gauss, IOO -77 
35. Direct, 5(e 2 — 1) 


13. Not exact, e 4 - 1 
19. By Stokes, ±1277 
25. 8/7, 1 18/49 
31. By Gauss, 40abc 


15. By Green, 1 152 tt 
21. 4/5, 8/15 
27. Direct, 5 
33. Direct, 77/7 
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Problem 

Set 11.1, page 485 




3. 

277/n, 277//?, fe, /://?, /://? 





1 


2 

/ 

l 

i 


\ 

13. 


+ 

— 

| COS X — 

— cos 3a: + 

— cos 5a- - 

• 4 • • 

• 


2 


77 

\ 

3 

5 


/ 


77 


4 

/ 

1 

1 


\ 

15. 

— 

+ 

— 

1 COS X + 

— cos 3a: + 

— cos 5a: 

4 • • 

• 

2 


77 

9 

25 


/ 


77 


2 

/ 

1 

1 


\ 

17. 

— 

— 

— 

1 COS X + 

— cos 3a: + 

— cos 5a: 

4 • • 

• 


4 


77 

\ 

9 

25 


/ 





1 

1 





+ 

sin . 

x - 

- — sin 2x + — sin 3a; 
2 3 

- + • • • 





4 

/ 

1 

1 

cos 3x 4- — 


\ 


19. 

— 

— 

COS X + — 

cos 5a - + • 

• • 




77 

V 

9 

25 


/ 




/ 


1 

1 . 

\ 




+ 

21 sin . 

x + — sin 

i 3x + — sin 

5a- + • • •) 




1 



/ 

1 

1 


\ 

21. 

¥ 

77 2 

— 

4 1 cos x - 

■ — cos 2x 4* 
4 

— cos 3x 
9 

- + • 

••) 

23. 

i 



4 

1 

4 

3x 4 

1 

— 

77 2 

— 

— cos X * 

_ cos 2x+ _ _ cos 

— cos 


6 



77 

2 

2777 


8 


29. /' = 2x, /" = 2J X = 0 ,j[ = -477, yi' = 0, a n = — (- — J {-Ait) cos / 177 , etc. 

/i 77 \ n / 


Problem Set 11.2, page 490 


'■H 


3. - - 


S. Rectifier. 


7. Rectifier, 


„ 2 4 
*■ 3 

lU-4, 


m 1 3 m 1 . 5m: \ 

— + — sin — h — sin — — + • • ■ J 

2 3 2 5 2 / 

/ 1 1 \ 

2 I cos m — — cos 2m + — cos 3m — I- • ■ ■ I 

2 4 ( 1 1 1 \ 

, I — — cos 2m 4- — — — cos 4m 4- — — cos 6m 4- • • • I 

tt tt \ 1 • 3 3 • 5 5*7 / 

14/ 1 1 \ 

, — g I cos m + TT cos 3m + — cos 5m + • • • ) 

2 77 \ 9 25 / 


i „ i i 

cos m cos 2m + — cos 3m — — cos 4m + 

l 4 9 16 


(me 1 

COS — h - 

\ 2 2 


1 1 3mc 1 5m: 1 

— cos me + — cos — — I- — cos — 1- — cos 3m 

2 9 2 25 2 18 


,, 3 1 „ 1 

13. — + — cos 2* + — cos 4 jt 

o 2 o 


15. Translate by 


17. Set x = 0. 
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Problem Set 11.3, page 496 

1. Even, odd, neither, even, neither, odd 
3. Odd 
5. Neither 
7. Odd 
9. Odd 

7T 4 / 1 1 \ 

11. — H I cos x 4- —■ cos 3% 4 — cos 5x + • • ♦ I 

2 7T \ 9 25 / 


( s 


1 1 

13. — l sin jc — — sin 3x 4- — sin 5 jc — f- 


•) 


4 

/ 77% 

i 

377% 

1 

577% \ 

7T 

i sin T 

+ — sin 
3 

2 

+ — sin 

- + - 


17. (a) 1. (b) 


TT 


/ 77% 

l 

377% 

1 

5 TO \ 

\ sin T 

+ — sin 
3 

2 

+ 5 

sin ^ 4* • • • 1 

2 / 


8 

/ 77% 

1 

3 77% 

i 

577% \ 

4 9 

COS — -h 

— cos 

— 7 — 4* 

— cos 

— — + * 

TT 2 

\ 2 

9 

2 

25 

2 / 


(b) 


4 

I 77% 

i 

1 

377% 

1 . \ 

TT 

\ sin T 

+ — sin to + 
2 

— sin 
3 

2 

+ — sin 2 to + • • • 1 


3 to 1 5 toc 

cos — 1- — cos — — 

2 5 2 


1 3 to: 1 5 to I 

sin to + — sin — 1- — sin — — sin 3 to + 

3 2 5 2 9 


3 2 / to 1 

2i - (a) i ~^r s T“ i 

6 / 77 X 1 

(b M sin T-3 

L 4 L ( 7 tx 1 3m 1 577% \ 

aw T" 7 rT t 9 “T t 25 C0t T + ’''J 

2 L ( 77 X 1 277% 1 377% \ 

(b) vl sin T‘i si " _ r + T si ”"r- + --7 

) 

) 


1 7 TO 

— COS — h - 

7 2 


TT 4 / 

• (a) 2 + 7 [ 

(b) 2 1 sin x + y si 


I 1 

cos x + — cos 3x + — cos 5x + 


sin 2x + — sin 3x + 
3 


Problem Set 11.4, page 499 
3. Use (5). 


50 ( — 1 V* 


9./ 2 


(~l) n • 

' 7 in 


n=-oc 

WnfcO 


13. 77 + Z 2 < 


n=-oc 

n# 0 


2/ 00 1 

7. — y 

it ^ 2n + 1 

n=— oo 


J2n+l)ix 


n. y- + 2 2 

n=-oo 


(- 1 )” 
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Problem Set 11.5, page 501 

3. (0.05rc) 2 in D n changes to (0.02/i) 2 , which gives C 5 = 0.5100, leaving the other 
coefficients almost unaffected. 

5. y = c x cos cot 4* c 2 sin cot 4 A(co) cos t, A(o>) = 1 /(<o 2 — 1) < 0 if <o 2 < 1 (phase 
shift!) and > 0 if co 2 > 1 

N 

7. y = c± cos (ot 4 c 2 sin cot 4 2 — 9 n 9 cos nt 

n- 1 " “ " 


9. y = c 1 cos <ot 4 c 2 sin cot 4 4 


77 4 

2 CO 2 77 


(4 


— cos t -I 5 — — COS 3/ 

1 (O 2 - 9 


11 . y = Cl cos ^ + c 2 sin »/ + ^ 


3 • 5(w 2 - 16) 

13. The situation is the same as in Fig. 53 in Sec. 2.8. 
3c 8 

I 5 * >' = ~ 7 T~T~F ~9 cos 3/ — — . „ » sin 3 1 

64 + 9c 2 64 + 9c 2 


cos 2t 


l7.y-2 (-^ 

n=l ' 


(1 — tt 2 ) 6 n \ _ , „ , 

cos nt 4 — sin nt), D n = (1 — n 2 ) 2 + n 2 c 2 

Dn / 


19. /(/) = 2 (A„ cos nt + sin /if), A n = (-l) w+1 ^ > 

71=1 ” 

2400 

B n = (-l) n+1 , D n = (10 - n 2 ) 2 + 100rt 2 

nDr, 


Problem Set 11.6, page 505 

/ 1 (-l) w+1 \ 

1. F = 2 1 sin a- - — sin 2x + • • • + - 5 — ^ sin TVa I , E* = 8.1, 5.0, 3.6, 2.8, 2.3 

3. F = — — — I cos a + — cos 3a 4- — cos 5a + • • • 1 , E* = 0.0748, 0.0748, 
0.0119,0.0119,0.0037 

2 4/1 1 1 \ 

5. F = I - — — cos 2a + - — — cos 4a 4- — — cos 6a + 1, 

7T Tt \ 1 -3 3-5 5-7 ) 

E* = 0.5951, 0.0292, 0.0292, 0.0066, 0.0066 

4/1 1 \ 

7. F = — I sin a 4- - sin 3a + y sin 5x +■■■], E* = 1.1902, 1.1902, 0.6243, 

0.6243, 0.4206 (0. 1 272 when N = 20) 

9. ^ (sin a 4- 4- sin 3a + sin 5a + • • E* = 0.0295, 0.0295, 0.0015, 


0.0119, 0.0119, 0.0037 


5. F = 


2 4 


.1902, 1.1902, 0.6243, 


0.0015, 0.00023 
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Problem Set 11.7, page 512 


1- fix) = ire x (a* > 0) gives A = I e v cos wu do = * , # = — - — ^ 

(see Example 3), etc. J ° 1+w 1 + w 

3. /(a) = gives A = 1/(1 4 w 2 ). 

5. Use / = (tt/2) cos u and (1 1) in App. 3.1 to get A = (cos (7 tw/ 2))/(1 — w 2 ). 

2 r 00 sin aw cos aw 2 r 00 cos w 4- w sin w — 1 

7. — aw 9. — « cos xw dw 


cos wi> rfy = ~ , # 

1 4 w 2 


w 

1 H- w 2 


. 2 r sin aw cos aw 

7« J 

77 Ja vi; 


77* ■'o W 

2 r°° COS 7TW 4 1 


2 C COS 

“• 7 I — 


- dw 

cos aw dw 


2 r cos w 4 w sin w — 1 
9. — 5 


2 r 00 sin iTW 

15. -J — 

7T J 0 1 “ 

2 r°° vra • 

19. -J 

TT J 0 


2 f“ 7TVV — sin 7TVV 


77 •'a 


sin xw dw 19, 


sin a w dw 


2 r 00 wa — sin wa 


sin aw dw 


Problem Set 11.8, page 517 
. n ( sin 2w — 2 sin w 




5. Vrfl e~ x (x > 0) 


7. Vir/2 cos vi' if 0 < vv < w/2 and 0 if w > 77/ 2 9. Yes, no 

11. V(2/'7r) iv/(w> 2 + 7T 2 ) 

13. S^OkT* 2 ' 2 ) - ^(-(e - ^®)') = wSF^e - * 2 ®) = w - " 2 ® 

17. S? c (/') = SF c (-a/) = -affe(/) = - J— ~z~T — 2 = v^stf) - J — ' 1. 

I Y 77 « 4 W V 77 

w&rhr 

19. In (5) for /(ax) set ax = y. 


J ~ ^2 , ,.,2 


Problem Set 11.9, page 528 

3. ik(e~ iblv - l)/(V2nw) 5. Y 

7. [(1 + /V)<T i ”’ - l]/(V2^u' z ) 9. Y 

11. le - "’ 2 ® _ 

13. (e ,bw — e~ tbw )l(iwy/2rr) = V2/7r(sin bw)hv 


5. \/ (2/ 7r)k (sin w)/w 
9. V(2/ir)i(cos w — l)/n> 


Chapter 11 Review Questions and Problems, page 532 


ii Ak / • 

11. — I si 

TT \ 


1 1 

sm 7TX 4 y sin 3m* 4 — sin 5m* 4 


( • x 

1 

i 

3a 

1 

1 5x \ 

sin — 

— — sin a 4 

— sin — 

— - sin 4x + 

— sm 1- • • • 

V 2 

2 

3 

2 

4 

5 2 / 


8 / . mr 1 3m* I 5m \ 

1S - 1? ( s,n T “ 9 T + 25 a " T ■ + ■ ■ 7 

2 4 / 1 1 1 \ 

17. ~l - r cos 16m- + — — cos 32m + — — cos 48m- + • • • ) 

n tt \ 1*3 3*5 5-7 / 
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77 1 2 l 

19. — - cos 2x + — cos 4a 
12 4 


7 - cos 6 a- + - 7 - cos 8a — V 
9 16 


21. tt/ 4 by Prob. 1 1 


23. 7t 2 /8 by Prob. 15 


25. j [/(a) + /(-a)], j [/(a) - /(-a)] 


8 

/ A 

1 

3x 

1 5a \ 

— 

COS h 

— cos 

— 4- 

— cos 1- • • • 

77 

\ 2 

9 

2 

25 2 / 


29. 8.105, 4.963, 3.567, 2.781, 2.279, 1.929, 1.673, 1.477 


31. y = Ci cos cot + C 2 sin cot + 
1 cos 4 1 


-■) 


7T 2 / cos / 1 

3ft > 2 \ or 2 — 1 4 


cos 2/ 1 cos 3* 

+ — 


w 2 - 4 9 w 2 - 9 


33 


16 w 2 - 16 

1 [ x (cos w + w sin w — 1) cos wa + (sin w — w cos w) sin via 


— J 

TT 


dw 


4 r 
37.- I 

7 T J 0 


IT J o 
2 J *°° 

77 •'o W“ 

4 f 00 sin 2w — 2 w cos 2w 


w 


2 r w — sin w cos w 
35. — ( * sin wa dw 

77 Jn 


W 


,3 


COS WA* dw 


39. I — 


7 r n> 2 + 4 


Problem Set 12.1, page 537 


1. 

U = 

Ci(A') cos 4y + c 2 (x) sin 4)' 

5. 

It = 

c(x)e~ y + e^Kx +1) 

9. 

u = 

Ci(A)y + c 2 {x)y~ 2 

15. 

c = 

1/4 

19. 

77/4 


27. 

u = 

110 — (1 10/ln 100) In (a 2 - 


3. u = Cj(a) + c 2 (a))> 

7. u = c(a) exp (|y 2 cosh a) 
11 . u = c(x)e v + /?()■) 

17. Any c 
21. Any c and ft) 
y 2 ) 29. u = C]A + c 2 (y) 


Problem Set 12.3, page 546 

1 . k cos 2-jTt sin 2tjx 
3. 


1 1 

cos 77 / sin 77 a + — cos 37 Tt sin 37 ta + yyy cos 5irt sin 5 77 a + 


8 k ( 

3 * 77 3 X 

■ 

+ — (V2 + 1) cos 377 ; sin 3tta — • • • j 


1 1 
COS 77 / sin 77 A - — COS 377 / Sin 377 A + COS 577 / sin 577 A - + 

9 25 

i) cos 77 / sin 77 A + cos 277/ sin 2 tta 


7 , 
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2 / i 

9. — g ( “ V2) cos 7 rt sin ttx — — (2 + V2) cos 37 Tt sin 377* 4* 
77 \ 9 

— (2 + V2) cos 5 7Tt sin 5m - + • • • ) 

25 / 


8 L 2 ( (nfl m I /3ir\ 2 

17. u = — 5 - I cos c — I r sin — + -=■ cos cl — 1 1 

^ 3 \ U / J ^ 3 3 \ L ) J 


2 TTX 

sin — h 


19. (a) «(0, /) = 0, (b) u(L, t) = 0, (c) 11,(0, /) = 0, (d) u x (L, t) = 0. C = —A, 

D = — B from (a), (b). Insert this. The coefficient determinant resulting from (c), 
(d) must be zero to have a nontrivial solution. This gives (22). 


Problem Set 12.4, page 552 

3. c 2 = 300/[0.9/(2 • 9.80)] = 80.83 2 [m 2 /sec 2 ] 
11. Hyperbolic, u = f x (a) + f 2 (x + y) 

13. Elliptic, u = fx(y + 3ia) + f 2 (y — 3/a) 

15. Parabolic, n = a/ 1 (a — y) + f 2 (x — y) 

17. Parabolic, u = xf x (2x + y) + f 2 (2x + y) 
19. Hyperbolic, u = (l/;y)/i(Ay) + f 2 (y) 


Problem Set 12.5, page 560 
5. u = sin 0.4tta- e- 175216 ^ 100 

7. ii = — (— sin 0.1m- e -° 01752 ^ + | sin 0.2m c" 001752 ®^ ) 

TT \TT 2 ) 


9. u = 


20V2 


TT 


|sin 0. 1 m e 0,1 


1 


— 0.01752 7r 2 t + ± gin Q 3wc O.Ol 752(3 7 t) 2 I 




11. ii = uj + u n , where u n = « — Uj satisfies the boundary conditions of the text, so 

4^ nm . 2 r L nm 

that «i, = 2 B n sin —7 — e (c L) ^, = — I (/(at) - iij(a)] sin — — dx 

«=i L L 0 L 

13. F = A cos px + 6 sin pA, F^O) = Bp = 0, B = 0, F\L) = — Ap sin pL = 0, 
p = nirlL, etc. 

15. 11 = 1 


17. u = 


2tt 2 


1 


+ 4 1 cos x e * cos 2a e~ 4t + — cos 3a e~ 9t — h • 


TT 


1 


19. u = — + cos 2x e~ 4t 4- — cos 4a e 16t + — cos 6 a e 


,-361 


12 
Ktt “ 

23. - — X nB n . 
L »« 1 


-A ft 


25. w = e 


f I 

27. = 0, w " = -Ne^/c 2 , w = - — (1 - <T aL )A + 1 , 


so that w(0) = w(L) = 0. 

29. u = (sin |77A* sinh |7730/sinh 77 

31. « - — 2 


tt w=>1 (2n — 1) sinh (2 n - 1)tt 


. (2ra — l)m . (2n — l)u y 

sin — smh 


24 


24 
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4^ sinh (/im/24) nny 
33. u = A 0 x 4- 2^ — cos 


71= X 

,24 


sinh mr 


24 


*• ■ 2? J. M ^ = TT /„ /W L ” 24 

2 f a nm 

f (x) sin dx 

Jo a 


wry 

cos “~zt~ dy 


“ # mrx m fi7r{b - y) 

35. ^ A n sin sinh > Ajj . , > i / \ i 

, a a ci sinh -^o 

71“ 1 


Problem Set 12.6, page 568 

2 sin ap ^ 2 r x 

, B = 0, u = — 

777? 77 j q 


1. A = 


sin ap 


cos px e 


-c*pH 


dp 


r 00 

3. A = e~ p \ B = 0, m = cos px e -P~ cZ P 2t dp 

J Q 

r* 

5. Set Try = 5 . A = 1 if 0 < p/77 < 1, £ = 0, u = I cos px e~° 2pH dp 

J o 

f°° 

7. A = 2[cos p 4 p sin p — l)/(7jp 2 )], B = 0 y u = I A cos px e~ c2p2t dp 

•'n 


Problem Set 12.8, page 578 


1. (a), (b) It is multiplied by V2. (c) Half 
3. B mn - 16 /(mmr 2 ) if m, n odd, 0 otherwise 
5. B mn = (— l) n+1 8/(/n«ir 2 ) if m odd, 0 if m even 
l.B mn = (-l) w+n 4/(m/w 2 ) 

11 . k cos V29 t sin 2x sin 5y 


64 “ 1 

i3. ^4 s 2 -4r 

m, n odd 


COS (A/j 7Z 2 4 « 2 ) sin hu' sin ny 


17. cirV260 (corresponding eigenfunctions F 4>16 and F 16<14 ), etc. 
19. B mn = 0 (m or n even), i5 TOn = \6kKjnmr 2 ) (m, n odd) 

21. fl mn = (-l) m+n 144fl 3 fc 3 /(m 3 n 3 77®) 


23. cos 



sin 


37 rx 

sin 

a 


477J 

~ 


Problem Set 12.9, page 585 

7. 30/* cos 0 4 10/* 3 cos 3 0 

220 / 1 1 

9. 55 4 I r cos 0 — — /* 3 cos 30 4 — - r 5 cos 50 — f- * 

77 \ 3 5 

77 4 / 1 o 1 - 

11. — I /* cos 0 4 — r 3 cos 30 4 — /* 5 cos 50 4 • • ■ 

2 77 \ 9 25 

15. Solve the problem in the disk r < a subject to u 0 (given) on the upper semicircle 
and — w 0 on the lower semicircle. 

4u 0 ( r 1 „ 1 _ \ 

u = — sin 0 4 — -sr r 3 sin 30 4 — =■ r 5 sin 50 4 • • • I 

77 \ a 3 a 3 5 a 5 ) 

17. Increase by a factor V2 19. T = 6.82 6pR 2 f 1 2 
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21. No 

23. Differentiation brings in a factor l/A m = R/(ca m ). 

Problem Set 12.10, page 593 

11. v = F(r) G(r), F" + k 2 F = 0,6 + c 2 k 2 G = 0 ,F n = sin (nir r/R), 

„ „ , „ 2 r R nm- 

G n = B n exp ( -c 2 n 2 TT 2 t/R 2 ), B n = — r/(r) sin — — dr 

K J 0 K 

13. u = 100 15. w = |r 3 P 3 (cos <£) — |rP 1 (cos <£) 

17. 64r 4 P 4 (cos 0) 

21. Analog of Example 1 in the text with 55 replaced by 50 
23. v = r(cos d)/r 2 = x/(x 2 + y 2 ), v = xy/(x 2 -f y 2 ) 2 


Problem Set 12.11, page 596 

5. w = + s 2 (5 ' V + 1} , mo, s) = o, c(s) = 0, w(x, t) = JC(/ - 1 + e-*) 

7. w = f(x)g(t), xf'g + fg = xt, take /(x) = x to get g = ce~ l + / - 1 and 
c = 1 from w(x, 0) = x(c — 1) = 0. 

9. Set x 2 /(4c 2 t) = z 2 . Use z as a new variable of integration. Use erf(<») = 1. 


Chapter 12 Review Questions and Problems, page 597 


19. u 
23. u 
27. u 


C\(y)e x + c 2 {y)e~ 2x 

cos t sin x — § cos 2/ sin 2x 

sin (0.02 ttv) e -° 004572t 


21. u = g(x)(l - e v ) + f(x) 

25. « = f cos r sin x — 3 cos 3 1 sin 3x 


200 / . me 
H* V sm 50 
100 cos 4x e~ l 
tt 16 / I 

T _ \4 

•I 


29. u 

31. u = 100 cos 4x e~ 16t 
33. it 


-0.004572t 


- -sm 


3 me 
50~ 


--0.04115t 


cos 2x e 41 + — cos 6x e 361 4- 
36 


100 


cos lOx <r 100t 


37. u = fi(y) + f 2 (x + y) 39. u = f t (y - 2 ix) + f 2 (y + 2/x) 
41. u = xfi(y - x) -I- f 2 (y - x) 

49. u = («! - t/ 0 )(ln r)/ln (rj/rfo) + (u 0 In /-j - t/j In /- 0 )/ln (r t /r 0 ) 


Problem Set 13.1, page 606 

5. x — iy = — (x + />•), x = 0 
9. -5/169 11. -7/13 -(22/13)/ 

15. -7/17 - (1 1/1 7)/ 17. x/(x 2 + y 2 ) 


7.484 

13. -273 + 136/ 

19. (x 2 - j' 2 )/(x 2 + y 2 ) 2 


Problem Set 13.2, page 611 

1. 3V2(cos (—477) -I- / sin (—^77)) 
3. 5 (cos 77 + / sin 77) = 5 cos 77 


5. cos ^77 + / sin 



App. 2 Answers to Odd-Numbered Problems 


A37 


7. §V(fF(cos arctan f -I- / sin arctan f) 9. — 37 t/4 

11. arctan (±3/4) 13. ±tt/4 15. 3tt/4 

17. 2.94020 + 0.59601/ 19. 0.54030 - 0.84147/ 

21. cos (— \tt) + / sin (— \rr), cos \tt + / sin §7 r 

23. ±(1 ±/)/V 2 25. — 1 , cos ± / sin |-7r, cos § 7 r ± / sin fir 

27. 4 + 3/, 4 - 8/ 29. | - /, 2 + J/ 

35. |z x + z 2 | 2 = tei + z 2 )tei + s 2 ) = tei + s 2 )tei + 22 )- Multiply out and use 
Re ZiZ 2 = I 21 Z 2 I (Prob. 32): 

Z 1 Z 1 + iif 2 + 2221 + 2222 = |2l| 2 + 2 Re ZiZ 2 + |2 2 | 2 = M 2 + 2|z 1 ||s 2 | + |z 2 | 2 
= (I 21 I + 12 2 |) 2 . 

Take the square root to get (6). 

Problem Set 13.3, page 617 

1. Circle of radius f , center 3 + 2/ 

3. Set obtained from an open disk of radius 1 by omitting its center z = 1 
5. Hyperbola Ay = 1 7. y-axis 

9. The region above y = x 

13. f = 1 - l/(z + 1) = 1 - (a + 1 - /»/[( a* + l) 2 + y 2 ]; 0.9 - 0.1/ 

15. (a 2 - y 2 - 2ixy)/(x 2 + y 2 ) 2 , -i/2 17. Yes since /* 2 (sin 2 0)/r-> 0 

19. Yes 21. 6 z\z z + /) 

23. 2/(1 - z)” 3 

Problem Set 13.4, page 623 

1. Yes 3. No 5. Yes 

7. No 9. Yes for z ^ 0 

11. = x/r = cos ft = sin ft ft; = —(sin 0)//*, ft = (cos 0)/r , 

(a) 0 = - v y — u r cos 0 + M 0 (-sin 0)/r — u r sin 0 — v 0 (cos 9)h\ 

(b) 0 = u v 4- v x = u r sin 6 + w#(cos 0)/r + u r cos 6 + u#(— sin 0)/r. 

Multiply (a) by cos ft (b) by sin ft and add. Etc. 

13. z 2 I2 15. In \z\ + / Arg z 17. z 3 

19. No 21. No 23. c = 1 , cos jc sinh y 

27. Use (4). (5). and (1). 

Problem Set 13.5, page 626 

3. -1.13120 + 2.47173/, <? = 2.71828 5. -/, 1 

7. e 0-8 (cos 5 - i sin 5), 2.22554 9. e -2x cos 2 y, —e _2x sin 2 y 

11. exp (a -2 — y 2 ) cos 2xy, exp (a 2 - y 2 ) sin 2xy 
13. e iM , e 5 ™ 14 

15. Vr" exp [i(9 4- 2 kv)/n], k = 0, • • •, n — 1 

17. 9e m 19. z = In 2 + m + 2nm ( n — 0, ±1, • • •) 

21. z = In 5 — arctan f / ± 2 mri (n = 0, 1, • • •) 

Problem Set 13.6, page 629 

3. Use (11), then (5) for e w , and simplify. 5. Use (II) and simplify. 

7. cos 1 cosh 1 — / sin 1 sinh 1 = 0.83373 - 0.98890/ 
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9. 74.203, 74.210 
13. -1 

17. z = ±(2/i + l)7n72 
21. z = ±nrri 

25. Insert the definitions on the left, multiply out, simplify. 


11. -3.7245 - 0.51 182/ 

15. cosh 4 = 27.308 

19. z = §(2/1 + 1 )tt - (—1)” 1.4436/ 


Problem Set 13.7, page 633 

1. In 10 + 7n 3. | In 8 — ^iri 

5. In 5 + (arctanf — ir)i = 1.609 — 2.214/ 

7. 0.9273/ 9. | In 2 - Jiri 

11. ±(2n 4- l)7r/, // = 0, 1, • • • 13. In 6 ± (2/i + l)iri, n = 0, 1, • • • 

15. (7r — I ± 2mr)i, n = 0, 1, • • • 

17. In (r -2 ) = (±2/i + 1 )m, 2 In / = ±(4/i + l)7rz, n = 0, 1, • • • 

19. <?°- 3 (cos 0.7 + / sin 0.7) = 1.032 + 0.870/ 

21. <? 2 (1 + i)/V 2 23. 64(cos (In 4) + i sin (In 4)) 

25. 2.8079 + 1.3179/ 27. (1 + i)/Vl 


Chapter 13 Review Questions and Problems, page 634 


17. -32 - 24/ 
23. 6V2e 3lH/4 
29. (±1 ± i)/V 2 
35. f(z) = e s2 

41.0 


19. - \i 

25. ne-™ 12 
31. f(z) = 1/z 
37. (-at 2 + y 2 )/ 2 
43. 0.6435/ 


21. 5 - 3/ 

27. ±(2 + 2/) 

33. f(z) = (1 + i)z 2 
39. No 
45. -1.5431 


Problem Set 14.1, page 645 

1. Straight segment from 1 + 3/ to 4 4- 12 i 

3. Circle of radius 3, center 4 + / 5. Semicircle, radius 1, center 0 

7. Ellipse, half-axes 6 and 5 


9. Parabola y = from —1 — p to 2 
11 . e~ u (OSiS 2if) 

13. t + i/t (1 s t g 4) 

17. —a — ib + re~ u (0 = t € 2 if) 

21.0 
25. i/2 
29. 2 sinh | 

Problem Set 14.2, page 653 

1. 777 , no 3. 0, yes 

7. 0, yes 9. 0, no 

15. Yes, by the deformation principle 
21. iri 23. 2777 

27. (a) 0, (b) 7T 29. 0 


4/ 

15. / -I- (4 - 4/ 2 )/ (-1 £ r 3 1) 

19. i + |i 

23. iri + \i sinh 2v 

27. — 1 + / tanh |tt = — 1 + 0.6558/ 


5. 0, yes 
11. 0, yes 

19. iri by path deformation 
25.0 
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Problem Set 14.3, page 657 

1. —47 r 3. 47T 5. %iri 

7. 0 9. —m 11. m 

13. t r 15. 2 tt7 Ln 4 = 8.710/ 

17. 7 r/cosh 2 (l + i) = 7r(— 0.2828 + 1.6489/) 


Problem Set 14.4, page 661 

1. 21 ml A 3. ~2irie^ 

7. 27n if |a| < 2, 0 if |«| > 2 
11. 2 tt 2 / 13. me al2 /2A if \a - 2 


5. ma 3 /3 

9. 7 n'(cos | — sin 5 ) 

/| < 3, 0 if | <2 — 2 — / 1 >3 


Chapter 14 Review Questions and Problems, page 662 

17. —67 77 19. 0 21. —7 77 

23. \i sin 8 25. 0 27. tt 29. 0 

Problem Set 15.1, page 672 

1. Bounded, divergent, ± 1 3. Bounded, convergent, 0 

5. Unbounded 

7. Bounded, divergent, ± 1/V2 ± /, 0, 1, —2 
9. Convergent, 0 

13. \zn - l\ < I z*n - l * I < |e (n > /V(e)), hence |z„ + zt - (/ + /*)| < |e + 

17. Convergent 19. Divergent 

21. Conditionally convergent 23. Divergent by Theorem 3 

27. n = 1 1 00 4- 75/ | = 125 (why?); |100 + 75/| 125 /125! = 1 25 125 /[V250 tt (125/e) 125 ] 
= e 125 /V2507r = 6.91 • 10 52 


Problem Set 1S.2, page 677 

1. 2 a n z 2n = 2 a„(z 2 )”, |z 2 | < R = lim |a„/a H+1 |, hence |z| <V/?. 


3. -/, I 

5. — 1, e by (6) and (1 + ]/n) n — > e . 

7. 0, \b/a\ 

9. 0, 1 

11 . 0 , 1 

13. 3 - 2/, 1 

15. f, 1/^2 

17. 0, V2 

Problem Set 1S.3, 

page 682 


1.3 

3. V2 

5. V5/3 

7. 1/V7 

9. 1 



Problem Set 15.4, page 690 

1. 1 - 2z + 2z 2 - |z 3 + fz 4 - + • • •, R = 00 

3. e -2i (l + (z + 2i) + |(z + 2/) 2 + §(z + 2 Z) 3 + £(z + 2/) 4 + • • •, R = 00 

5. 1 - |(Z - ^7r) 2 + g^(z - f T?) 4 - 7^(z - £tt) 6 +-•••, /? = OO 

7 . 5 + |/ + |/(z - i) + (-£ + {i)(z - if - £(z - /) 3 + - ■ • •, R = V2 
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q i - i-,2 . i r 4 L-6 . . 

y, L 2<- ' 8^ 48 4 ' 384- + 


R = oo 


11. 4 (Z - 1) + 10(z - l) 2 + 16(z - I) 3 + 14(z - l) 4 + 6(2 - l) 5 + (2 - l) 6 
13. (2/V^)(2 - 2 3 / 3 + 2 5 (2!5) - 2 7 /(3!7) + •••), /? = 00 
15. 2 3 /(l!3) - 2 7 /(3!7) + 2 u /(5!11) - + •••, R « =c 
19. 2 + §2 3 + ^2 5 + ^2 7 + • • % /? = 


Problem Set 15.5, page 697 

1. Use Theorem 1. 

5. \z n \ ^ 1 and 2 1//2 2 converges. 
9. |z + 1 - 2/| ^ r < R = 4 
13. Nowhere 


3. = 1/Vtt > 0.56 
7. |tanh” |z|| § 1, l/(« 2 + 1) < l/n 2 
11. |zl g 2 - S (8 > 0) 

15. |zj S V5 - 8 (S > 0) 


Chapter 15 Review Questions and Problems, page 698 

11. «0, e 3 * 13. 1, i Ln[(l + 2 )/(l - 2 )] 

15. oo 17. 1 /Vtt, [1 — tt(2 — 20 2 ] -1 

19. 1/3 

21. -1 - (2 - Tri) - (2 - irif/21 , R = =o 

23. 2 + 4(2 + 1) + £(z + l) 2 + is(z + l) 3 + * • •, R — 2 

25. 1 + 32 + 6z 2 + 10z 3 + ••%/? = 1. Differentiate the geometric series. 

27. / + (z + i) - i(z + if - (z + if + • • •, R = 1 

29. -(2 - £ir) + jj- (z - ^7r) 3 - -JJ- (2 - gTr) 5 + -•••,/? = « 


Problem Set 16.1, page 707 


1111 , 

1. — T + ~ + — + 1 + 2 + z 2 +•••,/?= 1 


2 4 2 3 Z 2 


1 


3. — 5 - + r + tt z — rrr z 2 H — ♦••,/? = 00 


z 3 2 2 2z 6 24 

1111 
5. — =■ + — r + 7 + — o' + * • • 

2 3 2 5 2z 7 6z 9 

„ ^ (z - I)”" 1 

7. = e 2, i = e 

2-1 . n 

n=0 


120 

R = 00 
1 

2 - 1 


2 — 1 

+ , + — + 






(z 

/? = 2 

30 

11. - 2 ft + o n_1 = 


ill 1 1 1 

7 + 7 + 7 (z — 0 - TT (z - 0 - 

z — 1 4 8 16 


1 


n=0 


z -l- / 


7 - 1 - (z + 0 , R = 1 


13. - 


2 - 1 


+ 2 + (2 - 1) 
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is. 2 z 3 ”’ 1*1 < i. -2 jLs * 1*1 > 1 

»«0 n -0 4 

17 . 2 1*1 < 1.-2 - 4^2 . 1*1 > 1 


19. 


n -0 n =0 v 

H - r + / + (z — i) 


{z - /') 


Z — I 


2i. (i - 4 Z ) 2 * 4n - 1*1 < i. ~ “*) 2 i <1*1 > i 

11=0 ' Z f n =0 4 

* (-l)” +1 (s + |ir) 2n_1 . ,, 

23 . 2 L - - ~ — . |z + W > 0 


n -0 


(2n)! 


Problem Set 16.2, page 711 


1. ±|, ±|, • • • (poles of 2nd order), <* (essential singularity) 

3. 0, ± Vtt, ±V27t, • • • (simple poles), » (essential singularity) 
5. » (essential singularity) 

7. ±1, ±t (fourth-order poles), * (essential singularity) 

9. ±i (essential singularities) 13. —16/ (fourth order) 

15. ±1, ±2, • • • (third order) 17. ±//V 3 (simple) 

19. ±2/ (simple), 0, ±2m, ±4 tt/, • • • (second order) 


21. 0, ±277, ±47T, • • ■ (fourth order) 

23. f(z) = (z ~ Zo ) n g(z), g(z 0 ) * 0, hence f(z) = (z - Zo? n g\z). 


Problem Set 16.3, page 717 


1. i, 4 i 


3. - 3 / (at z = 2 i), \i (at —21) 

5. 1/5! (at s = 0) 


7. 1 (at ±nif) 

9. (at z = 1), i (at 2 

= - 1 ) 

11 . - 1 (at z = ±|«r, • • 0 

15. e Vz = 1 + \lz + • • • 

, Ans. 2 777 

17. Simple poles at 4ns. —4 i 

19. —4m sinh §ir 


21. —4/ sinh | 

23.0 


25 . 5 (at z — 2)1 2 (at z - 3). Ans. 5m 

Problem Set 16.4, page 725 


1 . 2ir/Vl3 

3. 2-77/35 

5. 2-77/3 

7.0 

9. 7 T 

11. 2-77/3 

13. -77/16 

15.0 

17. tt/2 

19.0 

21.0 

23. 77 

25.0 

27. —17/2 


Chapter 16 Review Questions and Problems, page 726 

17. 2-777/3 

19. 5-tt 

21. §7TCOS 10 

23. 77774 

25. 0 (7t even), (— I) (n “ 1J/2 2-7r//(« - 1)! (n odd) 

27. 6m 

29. 2-7777 

31. 4-77/V3 

33.0 

35. 77-/2 
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Problem Set 17.1, page 733 
3. Only the size 

5. A' = c, w = —y + ic, y = k, w = —k + ix 
7. -3^/4 < Arg w < 377/4, |vt>| < 1/8 9. |w| > 3 

11. \w\ ^ 16, u ^ 0 13. Annulus 3 < \w\ < 5 

15. In 2 ^ u ^ In 3, rr/4 ^ v ^ rr!2 17. ±1, ±i 

19. 0, ±1, ±2, • • • 21. -all 

23. a and 0, \ r a 25. M = e x = 1 when a = 0 ,J = e 2x 

27. M = l/|z| = 1 on the unit circle, 7 = l/|z| 2 


Problem Set 17.2, page 737 

—iw 

5. z = 

— 2vv + 3 



9. z = 0 
13. z = ±i 

17. a — d = 0, b/c = 1 by (5) 


11. z = \ + i ± Vi + 1 

IS. vv = 4/z, etc. 

19. w = add (a ± 0, d * 0) 


Problem Set 17.3, page 741 

5. Apply the inverse g of / on both sides of z x = /(z x ) to get gfe) = g(f(Zi)) = z x . 
7. vv = (z + 2 i)/(z — 2 i) 9. vv = z — 4 

11. w = 1 lz 13. w = (3iz + 1 )/z 

15. vv = (z + l)/(-3z + 1) 17. w = (2z - /)/(— /z - 2) 

19. vv = (z 4 — /)/(— tz 4 + 1) 


Problem Set 17.4, page 745 

1. Annulus 1 S |w| § e 2 

3. 1/Ve < |vv| < Ve, 3ttI4 < arg vv < 5-77/4 

5. 1 < |vv| < e, v > 0 

7. vv-plane without 0 

9. « 2 /cosh 2 I + o 2 /sinh 2 1 g 1, u 0 

11. Elliptic annulus bounded by w 2 /cosh 2 1 + u 2 /sinh 2 1 = 1 and 
« 2 /cosh 2 5 + u 2 /sinh 2 5 = 1 
13. ±(2n + l)ir/2, n = 0, 1, • • • 

15. 0 < Im / < -77 is the image of R under / = z 2 . Ans. e* = e z ' 1 
17. 0, ±i, ±2 i, ■ • • 

19. « 2 /cosh 2 1 + u 2 /sinh 2 1 S 1 , v < 0 
21. v < 0 

23. — I = «= 1 , o = 0 (c = 0), « 2 /cosh 2 c + u 2 /sinh 2 c=l(c#0) 
25. In 2 ^ „ s in 3, ir/4 S v S ir/2 


Problem Set 17.5, page 747 

1. vv moves once around the unit circle. 5. -5/3, 2 sheets 
7. — i/2, 3 sheets 9. 0, 2 sheets 
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Chapter 17 Review Questions and Problems, page 747 


11. u = |u 2 - 1, \v 2 - 1 
15. The domain between u = 
17. |w + || = \ 

23. 0, (±1 ± r)/V2 
27. 0, ±//V 2 
33. w — zf(z + 2) 

39. 1 + i± Vl + 2i 
45. iz 3 + 1 


13. |vv| = 20.25, |arg w| < tt/2 
v 2 and u = 1 — \v 2 

19. u = 1 21. |arg w\ < ir/4 

25. 77/8 ± mr!2, n = 0, 1, • • • 

29. w = iz 31. w = 1 Iz 

35. ±V2 37. 2 ± V6 

41. w = e 3z 43. z 2 /2k 


Problem Set 18.1, page 753 

1. 20* + 200, 20z + 200 3. 1 10 - 50*y, 1 10 + 25 iz 2 

5. F = (1 10/ln 2)Ln z 7. F = 200 - (100/ln 2) Ln z 

13. Use Fig. 388 in Sec. 17.4 with the z- and w-planes interchanged, and 
cos z = sin (z + \tt). 

15. 0> = 220 - no*? 


Problem Set 18.2, page 757 

1. u 2 — v 2 = e^icos 2 y - sin 2 >0, = 4e 2 *(cos 2 y — sin 2 j>) = —4> yy , V 2 4> = 0 

3. Straightforward calculation, involving the chain rule and the Cauchy-Riemann 
equations 

5. See Fig. 389 in Sec. 17.4. 3> = sin 2 * cosh 2 y — cos 2 * sinh 2 y. 

9. (i) <I> = Uj(l — *}’)• (ii) w — iz 2 maps R onto — 2 ^ u ^ 0, thus 
<I>* = l/id + \u) = t/xd + \{-2xy)). 

11. By Theorem 1 in Sec. 17.2 

13. $ = 10[1 - (1/t r) Arg (z - 4)], F = 10(1 + (i/n) Ln (z - 4)] 

15. Corresponding rays in the w-plane make equal angles, and the mapping is 
conformal. 


Problem Set 18.3, page 760 

3. (100/d)y. Rotate through 7t/2. 5. 100 — 2409 tv 

7. Re F(z) = 100 + (200/tt) Re (arcsin z) 9. (240 /tt) Arg z 
11. T 0 + (2/tt)(T 1 - T 0 ) Arg z 13. 50 + (400 /tt) Arg z 

Problem Set 18.4, page 766 

1. V = iV 2 — iK, 'F = —Kx = const, = Ky = const 
3. F(z) = Kz (K positive real) 

5. V = (I + 2/) K, F = (1 - 2 i)Kz 
7. F(z) = z 3 

9. Hyperbolas (* + 1 )y = const. Flow around a comer formed by x — —1 and the 
x-axis. 

11. y/(x 2 + y 2 ) = c or x 2 + (y - k) 2 = k 2 
13. F(z) = z/i'o + r 0 /z 
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15. Use that u' = arccos z is w = cos z with the roles of the z- and w-planes 
interchanged. 

Problem Set 18.5, page 771 
5. 1 — r 2 cos 20 

7. 2 (r sin 6 - \r 2 sin 26 + \r 3 sin 36 — I- • • •) 

9. | r 2 sin 26 - ^r 6 sin 66 

11. r 2 — 4(r cos 6 — Jr 2 cos 20 + cos 30 — f • • •) 

4 / 1 1 

13. — I r sin 6 - — r 3 sin 30 + — r 5 sin 50 — I- • • • 
it V 9 25 

Problem Set 18.6, page 774 

1. No; |z| 2 is not analytic. 3. Use (2). F(§) = ^ 5. $(4, —4) = — 12 

7. Use (3). $(1, 1) = -2 11. \F(e i0 )\ 2 = 2 - 2 cos 20, 0 = ir/2. Max = 2 
13. \F(z)\ = [cos 2 2x + sinh 2 2 y] m , z = ±i. Max = [1 + sinh 2 2] 1/2 = cosh 2 = 3.7622 

15. No 


Chapter 18 Review Questions and Problems, page 775 

11. $ = 10(1 - * + y), F = 10 - 10(1 + i)z 

13. (20/ln 10) Ln z 15. (10/ln 10)(ln 100 - In r) 

17. Arg z = const 19. {—Hit) Ln z 

23. 7(a\ .y) = x(2y + 1) = const 25. Circles (a — c) 2 + y 2 = c 2 

27. F(z) = Ln (z - 5). Arg (z - 5) = c 
27 T 

29. 20 + — (r sin 0 + r* sin 30 -I- r 5 sin 50 + • • • 

7T \ 3 5 


Problem Set 19.1, page 786 


1. 0.9817- 10 2 , -0.1010- 10 3 , 0.5787- 10" 2 , -0.1360- 10 5 
3. 0.36443/(17.862 - 17.798) = 0.36443/0.064 = 5.6942, 0.3644/(17.86 - 17.80) = 
0.3644/0.06 = 6.073, 0.364/(17.9 - 17.8) = 3.64, impossible 


5. 


0.36443(17.862 + 17.798) 0.36443 • 35.660 


17.862 2 - 17.798 2 


13.00 

2.28 


= 5.702, 


13.0 


319.05 - 316.77 
13 


2.2 S - 5J °’2l‘ 5J -f = 5 


12.996 

2.28 


= 5.7000, 


7. 19.95, 0.049, 0.05013; 20, 0, 0.05 

9. In the present calculation, (b) is more accurate than (a). 

11 . -0.126 • 10" 2 , -0.402 • 10 -3 ; -0.267 • 10“ 6 , -0.847 • 10 -7 
13. Add first, then round. 

15 . ^ = 3l + ei h-*. + *L _ + ...) „?L + *L_±.3 l 

a 2 a 2 + e 2 a 2 \ a 2 a 2 2 } a 2 a 2 a 2 a 2 ' 
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(fL 

«i\ 

/ « 


_ *2 

\a 2 

a 2 )/ 

f 

a 2 

01 

a 2 


Si M + 1^1 S fa + fa 


19. (a) 19/21 = 0.904761905, e chop = e round = 0.1905 • 10" 5 , 

er.chop = ground = 0.2106 • 10 -5 , etc. 


Problem Set 19.2, page 796 

1. g = 1.4 sin a, 1.37263 (= x s ) S. g = x 4 + 0.2, 0.20165 (= a 3 ) 

7. 2.403 (= Xg, exact to 3S) 9. 0.904557 (= x 3 ) 

11. 1.834243 (= a 4 ) 13. a 0 = 4.5, x 4 = 4.73004 (6S exact) 

15. (a) 0.5, 0.375, 0.377968, 0.377964; (b) 1/V7 = 0.377964473 
17.A n+1 = (2x„. + 7/x n 2 )/3, 1.912931 (= x 3 ) 

19. (a) Algorithm Bisect (/, a 0 , b 0i N) Bisection Method 

This algorithm computes an interval [a nJ b n ] containing a solution of f(x) = 0 
(/ continuous) or it computes a solution c n , given an initial interval [fl 0 , b 0 ] such 
that f(a 0 )f(b 0 ) < 0. Here N is determined by ( b — ci)/2 N ^ fr & the required 
accuracy. 

INPUT: Initial interval [a 0 , b 0 ], maximum number of iterations N. 

OUTPUT: Interval [a N , b N ] containing a solution, or a solution c n . 

For n = 0, 1, • • •, N — 1 do: 

Compute c n = + b n ). 

If f(c n ) = 0 then OUTPUT c n . Stop. [Procedure completed] 

Else continue. 

If f(a n )f(c n ) < 0 then 

&n + 1 and b n+l = c n . 

Else set a n+1 = c n and b n+1 = b n . 

End 

OUTPUT [a N , b N ]. Stop. 

[Procedure completed] 

End BISECT 

Note that [a N , £%•] gives (a N 4- b N )/ 2 as an approximation of the zero and (b N — a N )/ 2 
as a corresponding error bound. 

(b) 0.739085; (c) 1.30980, 0.429494 
21. 1.834243 23. 0.904557 

Problem Set 19.3, page 808 

1. L 0 (x) = -2x + 19, Li(x) = 2a - 18, p^x) = 0.1082a + 1.2234, 

Pi.(9A) = 2.2405 

3. 0.9971, 0.9943, 0.9915 (0.9916 4D), 0.9861 (0.9862 4D), 0.9835, 0.9809 
5. p 2 {x) = -0.44304a 2 + 1.30906a - 0.02322, p z (0.75) = 0.70929 
l.p 2 {x) = -0.1434a 2 + 1.0895a, p 2 (0.5) = 0.5089, p 2 (1.5) = 1.3116 
9. L 0 = -g(.v - 1)(a — 2)(a - 3), Lj = ^a(a - 2)(a - 3), = -^a(a - 1)(a - 3), 

Lz = £v(a - 1)(a - 2); p 3 (x) = 1 + 0.039740a - 0.335187a 2 + 0.060645a 3 ; 
p 3 (0.5) = 0.943654 (6S-exact 0.938470), p 3 (1.5) = 0.510116 (0.51 1828), 
p 3 ( 2.5) = -0.047993 (-0.048384) 
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13. p 2 {x) = 0.9461a- - 0.2868a(a - l )/2 = -0.1434a 2 + 1.0895a 
15. 0.722, 0.786 

17. S/ 1/2 = 0.057 839, Sf 3l2 = 0.069 704, etc. 


Problem Set 19.4, page 815 

9. [—1.39 (a — 5) 2 + 0.58 (a - 5) 3 ]" = 0.004 at a = 5.8 (due to roundoff; should be 0). 
11. 1 - |a 2 + ix 4 

13. 4 - 12a 2 - 8a 3 , 4 — 12a 2 + 8a 3 . Yes 
15. I - a 2 , — 2(a - 1) - (a - l) 2 + 2(a - l) 3 , 

-1 + 2(x - 2) + 5(a - 2) 2 - 6(a - 2) 3 
17. Curvature /'/(l + f' 2 ) 312 = f" if \f'\ is small 

19. Use that the third derivative of a cubic polynomial is constant, so that g" is 

piecewise constant, hence constant throughout under the present assumption. Now 
integrate three times. 

Problem Set 19.S, page 828 

1. 0.747 131 3. 0.69377 (5S-exact 0.693 1 5) 

5. 1.566 (4S-exact 1.557) 7. 0.894 (3S-exact 0.908) 

9.J n 2 + e h/2 = 1.55963 - 0.00221 = 1.55742 (6S-exact 1.55741) 

11. J m + e hl2 = 0.90491 + 0.00349 = 0.90840 (5S-exact 0.90842) 

13. 0.94508, 0.94583 (5S-exact 0.94608) 

15. 0.94614588, 0.94608693 (8S-exact 0.94608307) 

17. 0.946083 (6S-exact) 

19. 0.9774586 (7S-exact 0.9774377) 

21. a - 2 = /, 1.098609 (7S-exact 1.098612) 

23. a = + 1). 0.7468241268 (lOS-exact 0.7468241330) 

25. (a) M z = 2, M 2 * = 4, \KM 2 \ = 2/(12 « 2 ), n = 183, (b) / (iv) = 24/a 5 , 2m = 14 
27. 0.08, 0.32, 0.176, 0.256 (exact) 

29. 5(0.1040 - | • 0.1760 + § • 0.1344 - | • 0.0384) = 0.256 


Chapter 19 Review Questions and Problems, page 830 

17. 4.266, 4.38, 6.0, impossible 19. 49.980, 0.020; 49.980, 0.020008 

21. 17.5565 gsi 17.5675 23. The same as that of a. 

25. -0.2, -0.20032, -0.200323 

27. 3, 2.822785, 2.801665, 2.801386, 2.801386 

29. 2.95647, 2.96087 

31. 0.26, M 2 = 6, M 2 * = 0, -0.02 geSO, 0.24 gag 0.26 
33. 1.001005, -0.001476 g e g 0 


Problem Set 20.1, page 839 

1. a x = —2.4, x 2 = 5.3 
5. Xj = 2, a 2 = 1 


3. No solution 

7. a x = 6.78, a 2 = -11.3, a 3 = 15.82 
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9. a'i = 0, x 2 = ^ arbitrary, x 3 = 5?! + 10 
11. x 1 = ?!, jc 2 = t 2 , both arbitrary, x 3 = 1 .25?i — 2.25/ 2 
13. jc x = 1.5, jc 2 = —3.5, jc 3 = 4.5, x 4 = —2.5 


Problem Set 20.2, page 844 



Problem Set 20.3, page 850 

3. Exact 21.5, 0, —13.8 5. Exact 2, 1, 4 7. Exact 0.5, 0.5, 0.5 

9. (a) x (3)T = [0.49982 0.50001 0.50002], (b) x (3)T = [0.50333 0.49985 0.49968] 
11. 6, 15, 46, 96 steps; spectral radius 0.09, 0.35, 0.72, 0.85, approximately 
13. [1.99934 1.00043 3.99684] T (Jacobi, step 5); [2.00004 0.998059 4.00072] T 
( Gauss -Seidel) 

17. V306 = 17.49, 12, 12 19. Vl8 F = 4.24|*|, 4|A|, 4|A| 
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Problem Set 20.4 V page 858 


1. 12, VS = 7.87, 6, ft -1 |] 

5. 1.9, VL35 = 1.16, 1, [0.3 -0.1 

7.6. V 6, 1,[1 11111 ] 

13. k — 100 ■ 100 
17.46 ^6- 17 or 7* 17 
21. [-0.6 2.8] t 


3. 14, V50 = 7.07,4, [-1 1 f -|] 

0.5 1.0] 

11. ! A || x = 17, || A -1 1| x = 17, k = 289 

15. K = 1.2-^ = 1.469 

19. [0 If, [1 — 0.4] T , 289 

23. 27, 748, 28375, 943656, 29070279 


Problem Set 20.5, page 862 

1. -11.4 + 5.4* 3. 8.95 — 0.388* 

5.s = -675 + 90 r, y av = 90 km/h 9. 4 - 0.75a- - 0.125a 2 
11. 5.248 + 1.543a, 3.900 + 0.5321a + 2.021a 2 
13. -2.448 + 16.23a, -9.114 + 13.73a + 2.500a 2 , 

-2.270 + 1.466a -1.778a 2 + 2.852a 3 


Problem Set 20.7, page 871 

1.5SAi9 3. 5, 0, 7; radii 4, 6, 6 

5. |A - 4/| gV2 + 0.1, |A| § 0.1, |A - 9/| S V? 

7. Jx! = 100, ?22 = hs = I 

9. They lie in the intervals with endpoints a fJ ± (n - 1)10“ 6 . (Why?) 

11. 0 lies in no Gerschgorin disk, by (3) with >; hence det A = Aj • • • A„ =£ 0. 

13. p(A) ^ Row sum norm || A || M = max 2 M = max (1^1 + Gerschgorin radius) 

3 k 3 

15. Vl53 = 12.37 17. Vl22 = 11.05 19. 6 ^ A S 10, 8 ^ A S 8 

Problem Set 20.8, page 875 

1. q = 4, 4.493, 4.4999; |e| s 1.5, 0.1849, 0.0206 
3. q = 8, 8.1846, 8.2252; H S I, 0.4769, 0.2200 
5. q = 4, 4.786, 4.917; |e| S 1.63, 0.619, 0.399 

7. q = 5.5, 5.5738, 5.6018; |e| S 0.5, 0.3115, 0.1899; eigenvalues (4S) 1.697, 3.382, 
5.303,5.618 

9. y = Ax = Ax, y T x = Ax T x, y T y = AVx, 
e 2 ^ y T y/x T x — (y T x/x T x) 2 = A 2 — A 2 = 0 
11. q = 1, • • •, —2.8993 approximates —3 (0 of the given matrix), 

M g 1.633, • • •, 0.7024 (Step 8) 


Problem Set 20.9, page 882 



3.500000 

-1.802776 

0' 


~ 0.980000 

-0.441814 

0‘ 

1 . 

- 1.802776 

6.730769 

1.846154 

3. 

-0.441814 

0.870164 

0.371803 


0 

1.846154 

1.769230. 


0 

0.371803 

0.489836. 



App. 2 Answers to Odd-Numbered Problems 


A49 


5. Eigenvalues 8, 3, 1 


‘ 5.64516 

-2.50867 

0“ 


" 7.45139 

-1.56325 

O' 

-2.50867 

5.307219 

0.374953 

♦ 

-1.56325 

3.544142 

0.0983071 

0 

0.374953 

1.04762. 


0 

0.0983071 

1.00446. 

[ 7.91494 

-0.646602 

0 

1 





-0.646602 3.08458 0.0312469 

. 0 0.0312469 1.000482 . 



"18.3171 

0.881767 

0 


’18.3786 

0.39651 1 

0 


7 . 

0.881767 

8.29042 

0.360275 

5 

0.396511 

8.24727 

0.0600924 



_ 0 

0.360275 

1.39250 - 


- 0 

0.0600924 

1.37414 - 



’ 18.39 10 

0.177669 

0 

1 





0.177669 

8.23540 

0.0100214 





- 0 

0.0100214 

1.37363 

J 





”7.00224 

0.0571287 

0 


“7.00298 

0.0326363 

0 

1 

9 . 

0.0571287 

4.00088 

0.0249333 

* 

0.0326363 

4.00034 

0.00621221 


_0 

0.0249333 

0.996875 . 


_0 

0.00621221 

0.996681 

J 


'7.00322 

0.0186419 

0 






0.0186419 

4.0001 1 

0.00154782 





_0 

0.00154782 

0.996669 






Chapter 20 Review Questions and Problems, page 883 

17. [4 -1 2] T 19.16 -3 1] T 

21. All nonzero entries of the factors are 1. 


23. 


" 2.8193 
-1.5904 


-1.5904 

1.2048 


L -0.0482 -0.0241 

27. 15, V89, 8 
33.6 


-0.0482 
-0.0241 
0.1205J 
29. 7, V2l, 4 
35.9 


(4D-values) 


25. Exact [-2 1 2] T 


31. 14, V78, 7 

37. 11.5-4.4578 = 51.2651 


39. y = 1.98 + 0.98* 

41. Centers 1,1,1, radii 2.5, 1, 2.5 (A = 2.944, 0.028 ± 0.290/, 3D) 
43. Centers 5, 6, 8; radii 2, 1, 1, (A = 4.1864, 6.4707, 8.3429, 5S) 



■ 1.5 

-2.23607 

0 ‘ 


' 9.44973 

-1.06216 

0 

45. 

-2.23607 

5.8 

-3.1 

, Step 3: 

-1.06216 

4.28682 

-0.00308 


. 0 

-3.1 

6.7. 


. 0 

-0.00308 

0.26345 _ 
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Problem Set 21.1, page 897 

I. v = e x , 0.0382, 0.1245 (error of a 5 , .v 10 ) 

3. v = jc - tanh a (set v - x = u), 0.009292, 0.0188465 (error of a 5 , x 10 ) 

5. v = e x , 0.001275, 0.004200 (error of x s , x 10 ) 

7. v = 1/(1 - a 2 /2), 0.00029, 0.01 187 (error of x 5 , a 10 ) 

9. y = 1/(1 - a 2 /2), 0.03547. 0.28715 (error ofx 5 , a 10 ) 

11 . v = 1/(1 — x 2 /2); error — 10 -8 , —4 • 10 -8 , • • •, — 6 • 10 -7 , +10 -5 ; 

about 1.3 • 10 _a by (10) 

13. y = xe x \ error - 10 5 (for a = 1, • • -, 3) 19, 46, 85, 139, 213, 315, 454, 640, 889, 1219 
15. y = 3 cos a — 2 cos 2 a; error • 10 7 : 0.18, 0.74, 1.73, 3.28, 5.59, 9.04, 14.33, 22.77, 
36.80,61.42 

17. y = 1/(a 5 + 1), 0.000307, -0.000259 (error of a 5 , a 10 ) 

19. The errors are for E.-C. 0.02000, 0.06287, 0.05076, for Improved E.-C. -0.000455, 
0.012086, 0.009601, for RK 0.0000012, 0.000016, 0.000536. 

Problem Set 21.2, page 901 

3. y = e~ 01x2 \ errors 10 -6 to 6 • 10“ 6 

5. y = tan a; y 4 , • • y 10 (error- 10 5 ): 0.422798 (-0.48), 0.546315 (-1.2), 0.684161 

(-2.4), 0.842332 (-4.4), 1.029714 (-7.5), 1.260288 (-13), 1.557626 (-22) 

7. RK-error smaller, error • 1 0 5 = 0.4, 0.3, 0.2, 5.6 (for x = 0.4, 0.6, 0.8, 1 .0) 

9. 3-4 = 4.229 690, y 5 = 4.556 859, >- 6 = 5.360 657, 3-7 = 8.082 563 

II. Errors between -6 • 10 -7 and +3 • 10 -7 . Solution e x - x - 1 
13. Errors • 10 5 from a = 0.3 to 0.7: —5, —11, - 19, —31, —47 

15. (a) 0, 0.02, 0.0884, 0.215 848, 3-4 = 0.417 818, y 5 = 0.708 887 (poor). 

(b) By 30-50% 

Problem Set 21.3, page 908 

3. 3-! = e x , y 2 = — e x , errors range from ±0.02 to ±0.23, monotone. 

5. y[ = 3- 2 , 3-2 = -43 -j , y = y 1 = 1, 0.84, 0.52, 0.0656, -0.4720; 3- = cos 2a 
7. 33 = 4e~ x sin a, y 2 = 4e~ x cos a; errors from 0 to about ±0. 1 
9. Errors smaller by about a factor 1 0 4 
11. y - 0.198669, 0.389494, 0.565220, 0.719632, 0.847790; 

y = 0.980132, 0.922062, 0.830020, 0.709991, 0.568572 
13. y x = e~ 3x - e~ 5x \ y 2 = e~ 3x + e” 5 *; y, = 0.1341, 0.1807, 0.1832, 0.1657, 

0.1409; 3-2 = 1.348’, 0.9170, 0.6300, 0.4368, 0.3054 
17. You get the exact solution, except for a roundoff error [e.g., 3-1 = 2.761 608, 
y(0.2) = 2.7616 (exact), etc.]. Why? 

19. y = 0.198669, 0.389494, 0.565220. 0.719631, 0.847789; 
y' = 0.980132, 0.922061, 0.830019, 0.709988, 0.568568 

Problem Set 21.4, page 916 

3. 105, 155, 105, 1 15; Step 5: 104.94, 154.97. 104.97, 1 14.98 
5. 0.108253, 0.108253, 0.324760, 0.324760; Step 10: 0.108538, 0.108396, 0.324902, 
0.324831 
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7. 0, 0, 0, 0. All equipotential lines meet at the corners (why?), Step 5: 0.29298, 
0.14649, 0.14649, 0.073245 
9. — 3« u + w 12 = —200, « u - 3u i2 = —100 
11. U \2 = M 32 = 31.25, « 2 i = «23 = 18.75, ttjk = 25 at the others 

13. u 2 1 = M 23 = 0.25, w 12 = M 32 = —0.25, Uj k = 0 else 

15. (a) w u = -w 12 = — 66 . (b) Reduce to 4 equations by symmetry. 

«n = «31 = -«15 = -«35 = -92.92, «21 = -«25 = ~87.45, 

«12 = W32 = — «14 = — W34 = —64.22, «22 = — 1( 24 = —53.98, 

m 13 = M23 = u 33 = 0 

17. V5, = = 0.0849, i /^2 = u 22 = 0.3170. (0.1083, 0.3248 are 4S-values of 

the solution of the linear system of the problem.) 


Problem Set 21.5, page 921 

5. u 11 = 0.766, u 2 1 — 1.109, « 12 = 1.957, m 2 2 = 3.293 

7. A as in Example I, right sides - 2 , — 2 , —2, —2. Solution « u = w 21 = 1.14286, 
Uj 2 = W 22 = 1.42857 

11. -4« n + «21 + «12 = -3, w n - 4 h 2 i + «22 = -12, m u - 4 u x2 + u 2 2 = 0, 
2^23 “I" 2 m ^2 12 m 22 — 14, Wn — W 22 — 2, W 21 — 4, — 1. Here 

— 14/3 = — 1(1 + 2.5) with 4/3 from the stencil. 

13. b = [-380 -190, -190, 0] T ; u n = 140, u 21 = w 12 = 90, u 22 = 30 


Problem Set 21.6, page 927 

5. 0.1636, 0.2545 (t = 0.04, x = 0.2, 0.4), 0.1074, 0.1752 (/ = 0.08), 0.0735, 0.1187 
(t = 0.12), 0.0498, 0.0807 (t = 0.16), 0.0339, 0.0548 (t = 0.2; exact 0.0331, 0.0535) 
7. Substantially less accurate, 0.15, 0.25 (t = 0.04), 0.100, 0.163 (t = 0.08) 

9. Step 5 gives 0, 0.06279, 0.09336, 0.08364, 0.04707, 0. 

11. Step 2: 0 (exact 0), 0.0453 (0.0422), 0.0672 (0.0658), 0.0671 (0.0628), 0.0394 
(0.0373), 0 (0) 

13. 0.1018, 0.1673, 0.1673, 0.1018 (/ = 0.04), 0.0219, 0.0355, •••(? = 0.20) 

15. 0.3301, 0.5706, 0.4522, 0.2380 ( l = 0.04), 0.06538, 0.10604, 0.10565, 0.6543 
(/ = 0.20) 


Problem Set 21.7, page 930 

1. Forx = 0.2, 0.4 we obtain 0.012, 0.02 (t = 0.2), 0.004, 0.008 (t = 0.4), -0.004, 
-0.008 (/ = 0 . 6 ), etc. 

3. u(x, 1) = 0, -0.05, -0.10, -0.15, -0.075, 0 

5. 0.190, 0.308, 0.308, 0.190 (0.178, 0.288, 0.288, 0.178 exact to 3D) 

7. 0, 0.354, 0.766, 1.271, 1.679, 1.834, •••(? = 0.1); 0, 0.575, 0.935, 1.135, 1.296, 
1.357, ■■■({ = 0.2) 


Chapter 21 Review Questions and Problems, page 930 

17. y = tan x; 0 (0), 0.10050 (-0.00017), 0.20304 (-0.00033), 0.30981 (-0.00047), 
0.42341 (-0.00062), 0.54702 (-0.00072) 
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19. 0.1003349 (0.8 • 10 -7 ) 0.2027099 (1.6 • 10 -7 ), 0.3093360 (2.1 • 10" 7 ), 0.4227930 
(2.3- 10~ 7 ), 0.5463023 (1.8 • 10" 7 ) 

25. „y(0.4) = 1.822798, y(0.5) = 2.0463 1 5, v(0.6) = 2.284161, y(0.7) = 2.542332, 
y(0.8) = 2.829714, ,y(0.9) = 3.160288, >(1.0) = 3.557626 

27. yi = 3e~ 9x , y 2 = ~5e~ 9x , [1.23251 -2.05419], [0.506362 -0.843937], • • •, 

[0.035113 -0.058522] 

29. 1.96, 7.86, 29.46 

31. u(P u) = it(P 31) = 270, u(P 21) = u(P 13) = u(P 23) = u(P 33) = 30, 

“(^ 12 ) = u(P 32 ) = 90, u{P 22 ) = 60 

35. 0.06279, 0.09336, 0.08364, 0.04707 

37. 0, -0.352, -0.153, 0.153, 0.352, 0 if t = 0.12 and 0, 0.344, 0.166, -0.166, 
-0.344, 0 if t = 0.24 

39. 0.010956, 0.017720, 0.017747, 0.010964 if t = 0.2 

Problem Set 22.1, page 939 

3. / = 3(x x - 2f + 2(x 2 + 4 f - 44. Step 3: [2.0055 -3.9975] 1 " 

5. / = 0.5(a 1 - l) 2 + 0.7 (a- 2 + 3) 2 - 5.8, Step 3: [0.99406 -3.0015] 7 
7. / = 0.2( Xl - 0.2 f + a- 2 2 - 0.008. Step 3: [0.493 -0.01 1] T , 

Step 6: [0.203 0.004] 7 


Problem Set 22.2, page 943 


3. No 

13. / max = f(9, 6) = 36 


1. .v 3 , .v 4 unused time on M x , M 2 . respectively 

11. /max = m 5) = 10 

15. / min = /( 3.5, 2.5) = -30 

17. x x /3 + x 2 /2 ~ 100, a'x/3 + a 2 /6 ^ 80, / = 150a'x + 100a- 2 , 

/max = /(210, 60) = 37500 

19. 0.5ax + 0.75a 2 ^ 45 (copper), 0.5ai + 0.25a 2 g 30, / = 120 ax + 1 00a 2 , 
/max = /(45, 30) = 8400 


Problem Set 22.3, page 946 

1. /(120/11, 60/11) = 480/11 


/ 2100 200 \ 

3 - / (—’Wj= 78000 


5. Matrices with Rows 2 and 3 and Columns 4 and 5 interchanged 
7. /( 0, ^) = - 10 9. /( 5, 4, 6) = 478 


Problem Set 22.4, page 952 

1. /(4, 4) = 72 
7. /(I, 1,0) = 12 


3. /(20, 30) = 50 
9. f(i 0, £) = 3 


5. /( 10, 5) = 5500 


Chapter 22 Review Questions and Problems, page 952 

11 . Step 5: [0.353 -0.028] 7 . Slower 
13. Of course! Step 5: [ - 1 .003 1 ,897] T 

21. /( 2, 4) = 100 23. /( 3, 6) = -54 


25. /( 50, 100) = 150 
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Problem Set 23.1, page 958 



Edge 


e l e 2 e 3 



Vertex 

Incident Edges 

1 


2 


3 

^2? ^3 

4 

*4 


Problem Set 23.2, page 962 

1. 4 3. 5 5. 4 

9. The idea is to go backward. There is a v } <._* adjacent to u k and labeled k - 1, etc. 
Now the only vertex labeled 0 is s. Hence A(i; 0 ) = 0 implies v 0 = so that 
v 0 — Vi — • • • — v k ^. 1 - v k is a path s — » v k that has length k. 

15. No; there is no way of traveling along (3, 4) only once. 

21. From m to 100r», 10m, 2.5m, m 4- 4.6 

Problem Set 23.3, page 966 

1. (1, 2), (2, 4), (4, 3); L 2 = 6, L z = 18, L 4 = 14 

3. (1, 2), (1, 4), (2, 3); L 2 = 2, L 2 = 5, U = 5 

5. (1, 4), (2, 4), (3, 4), (3, 5); L 2 = 4, L 3 = 3, L 4 = 2, L 5 = 8 

7. (1, 5), (2, 3), (2, 6), (3, 4), (3, 5); L 2 = 9, L 3 = 7, L 4 = 8, L 5 - 4, L 6 = 14 


Problem Set 23.4, page 969 

2 I 

1. 1 ( 3 L = 12 3.4^2 L = 10 

4( 3-5 

5 


8 

5. 1 - 2 ( 5 

3(- 6 - 4 
7 


L = 28 
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2 

9. 1 - 3 - 4 ( L = 38 11. Yes 

5-6, 

15. G is connected. If G were not a tree, it would have a cycle, but this cycle would 
provide two paths between any pair of its vertices, contradicting the uniqueness. 

19. If we add an edge (u, v) to T, then since T is connected, there is a path u — * v in T 
which, together with (u, v), forms a cycle. 

Problem Set 23.S, page 972 

1. (I, 2), (1,4), (3, 4), (4, 5), L= 12 

3. (1, 2), (2, 8), (8, 7), (8, 6), (6, 5), (2, 4), (4, 3), L = 40 

5. (1, 4), (3, 4), (2, 4), (3, 5), L = 20 

7. (1, 2), (1, 3), (1, 4), (2, 6), (3, 5), L = 32 

11. If G is a tree 

13. A shortest spanning tree of the largest connected graph that contains vertex 1 

Problem Set 23.6, page 978 

1. 1 - 2 - 5, A/ = 2; 1 - 4 - 2 - 5, A/ = 2, etc. 

3. 1 - 2 - 4 - 6, A/ = 2; 1 - 2 - 3 - 5 - 6, A/ = 1, etc. 

5. f i2 = 4, /i 3 = 1, fi4 = 4, f 4Z = 4, / 43 = 0, / 25 = 8, / 35 = 1, / = 9 

7. f 12 = 4, / 13 = 3, f 24 = 4, f 35 = 3, jfg4 = 2, / 46 = 6, /gg = 1, / = 7 

9. {4, 5, 6}, 28 11. {2, 4, 6), 50 

13. 1 - 2 - 3 - 7, A/ = 2; 1 - 4 - 5 - 6 - 7, A/ = 1; 

1 - 2 - 3 - 6 - 7, A/ = I; / max = 14 

15. {3, 5, 7}, 22 17. 5 = { 1, 4}, cap (S, T) = 6 + 8 = 14 

19. If fij < Cij as well as /y > 0 


Problem Set 23.7, page 982 
3. (2, 3) and (5, 6) 

5. 1-2-5, A t = 2; 1 - 4 - 2 - 5, A, = 1; / = 6 + 2-M=9 
7. 1 — 2 — 4 — 6, A ( = 2; 1 - 3 - 5 - 6, A t = 1; / = 4 + 2+ l= 7 
9. By considering only edges with one labeled end and one unlabeled end 
17. S = (1, 2, 4, 5}, T = {3, 6), cap (5, T) = 14 


Problem Set 23.8, page 986 


1. No 3. No 5. Yes, 5= {1,4,5, 8} 

7. Yes; a graph is not bipartite if it has a nonbipartite subgraph. 

9. 1 - 2 - 3 - 5 


11. (1, 5), (2, 3) by inspection. The augmenting path 1 — 2 — 3 — 5 
gives 1 - 2 - 3 - 5, that is, (1, 2), (3, 5). 

13. (1, 4), (2, 3), (5, 7) by inspection. Or (1, 2), (3, 4), (5, 7) by the use of the path 


1 - 2 - 3 - 4. 
15.3 
25. K 3 


23. No; K s is not planar. 


19.3 



App. 2 Answers to Odd-Numbered Problems 


ASS 


Chapter 23 Review Questions and Problems, page 987 



21. Vertex 

Incident Edges 

23.4 

1 



2 

-*i. *3 


3 

—e 2 


25.4 


27. L 2 = 

29. 1 - 4 - 3 

1 

to 

II 

On 

33. / = 7 


Problem Set 24.1, page 996 

1. = 19, qw — 20, qy = 20.5 

5. q L = 69.7, q M - 70.5, q v = 71.2 
9. q L - 399, q M = 401, q v = 401 
13. * = 70.49, j = 1.047, IQR = 1.5 
17.0 0 300 


3. q L — 38, qyi — 44, q v — 54 
7. q L = 2.3, qM = 2.4, qy = 2.45 
11. x = 19.875, 5 = 0.835, IQR = 1.5 
15. * = 400.4, 5 = 1.618, IQR = 2 
19. 3.54, 1.29 


Problem Set 24.2, page 999 

1. 4 outcomes: HH, HT, TH, TT (H = Head, T = Tail) 

3. 6 2 = 36 outcomes (1, 1), (1, 2), • • •, (6, 6) 

5. Infinitely many outcomes S, S e S, S C S C S, • • • (5 = “Six”) 

7. The space of ordered triples of nonnegative numbers 
9. The space of ordered pairs of numbers 
11. Yes 

13. E = {S, S C S, S C S C S], E° = {S C S C S C S, S C S C S C S C S, • • •} (5 = “Six”) 


Problem Set 24.3, page 1005 

1. (a) 0.9 3 = 72.9%, (b) $ • § • § = 72.65% 

3 490 . 489 . 488 . 487 . 486 _ nn 3 w 
500 499 498 497 496 VU.JO /C 

5. 1 - ^ = 0.96 7. 1 - 0.75 2 = 0.4375 < 0.5 

9. P(MMM) + P(MMFAf) + P(MFMM) 4- P(FMMM) = | + 3-^ = ^ 
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11. ^5 + §§ — ^ = §§ by Theorem 3, or by counting outcomes 
13. 0.08 + 0.04 - 0.08 • 0.04 = 11.68% 

15. 0.95 4 = 81.5% 17. 1 - 0.97 4 = 11.5% 


Problem Set 24.4, page 1010 

3. In 40 320 ways 
7.210,70, 112, 28 
11. (g) = 635013559600 
15. 676000 

Problem Set 24.5, page 1015 

1. it = 1/55 by (6) 

5. No because of (6) 

9. P{X > 1200) = f 6[0.25 - (x - 
J 1.2 

11. it = 2.5; 50% 

17. X > b, X it b, X < c, X ^ c, etc. 

Problem Set 24.6, page 1019 

1. 2/3, 1/18 
5. 4, 16/3 

9. m = 1/0 = 25; P = 20.2% 

13. 750, 1, 0.002 

Problem Set 24.7, page 1025 

1. 0.0625, 0.25, 0.9375, 0.9375 
5. 0.265 

7. f(x) = 0.5 x e~ os /x\, f(0) + f(l) = e 
9. 1 - e~ 0 - 2 = 18% 

n 122 125 _3fl. 

286 ’ 286 > 286 ’ 286 

Problem Set 24.8, page 1031 

1. 0.1587, 0.6306, 0.5, 0.4950 
5. 16% 

9. About 23 
13. t = 1084 hours 


5. ( 2 3 °) =1140 

9. 9!/(2!3!4!) = 1260. Arts. 1/1260 
13. 1/84, 5/21 


3. it = 1/8 by (10) 

7. 1 - P(X ^ 3) = 0.5 

.5) 2 ] dx = 0.896. Arts. 0.896 3 = 72% 
13.it = 1.1565; 26.9% 


3. 3.5, 2.917 
7. $643.50 

ll.i^,(X-i)V20 

15. 15c - 500c 3 = 0.97. c = 0.0855 


3. 64% 

°- s (I.O + 0.5) = 0.91. Airs. 9% 
11. 0.99 100 = 36.6% 


3. 17.29, 10.71, 19.152 
7. 31.1%, 95.5% 

11. About 58% 


Problem Set 24.9, page 1040 

1. 1/8, 3/16, 3/8 3. 2/9, 2/9, 1/2 

5* fz(y) = 1/(^2 — o> 2 ) if a 2 < y < f3 2 and 0 elsewhere 

7. 27.45 mm, 0.38 mm 9. 25.26 cm, 0.0078 cm 
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13. Independent, fi(x) = 0.1<? if x > 0, / 2 (>0 = 0 Ae 0 ly if y > 0, 36.8% 

15. 50% 17. No 


Chapter 24 Review Questions and Problems, page 1041 

21. Q l = 22.3, Q m = 23.3, Q v = 23.5 23. x = 22.89, s = 1.028, s 2 = 1.056 

25. //, 77/, 777/, etc. 

27. /(0) = 0.80816, /(l) = 0.18367, f(2) = 0.00816 
29. Always B C A U B. If also A C #, then 5 = A U etc. 

31. 7/3, 8/9 33. 118.019, 1.98, 1.65% 

35. 0, 2 37. /jl = 100/30 

39. 16%, 2.3% (see Fig. 520 in Sec. 24.8) 


Problem Set 25.2, page 1048 

3 . 1 = p\\ — p) n ~ k y p — kfn , k = number of successes in n trials 
5.11/20 

7.1 = /(a), d(ln l)fdp = 1//; — (a — 1 )/( 1 — /;) = 0, p = \!x 

9. jl = x 11. 6 = nl 2 Aj = 1 /a 

13. 0 = 1 15. Variability larger than perhaps expected 


Problem Set 25.3, page 1057 

1. CONF 095 { 37.47 Sfig 43.87} 3. Shorter by a factor V2 

5. 4, 16 7. Cf. Example 2. n . = 166 

9. CONF 0 . 99 { 20.07 20.33} 11. CONF 099 {63.71 ^ s 66.29} 

13. c = 1.96, * = 87, s 2 = 87 • 413/500 = 71.86, k « cslV^i = 0.743, 
CONFo. 95 {86 ^ tx £ 88}, CONF 095 {0. 17 SpS 0.18} 

15. CONF 0 . 95 { 0.00045 S tr 2 S 0.00131 } 17. CONF 095 {0.73 S <r 2 S 5.19} 

19. CONF 0 95 {23 ^ <t 2 £= 548}. Hence a larger sample would be desirable. 

21. Normal distributions, means —27, 81, 133, variances 16, 144, 400 
23. Z = X 4 Y is normal with mean 105 and variance 1.25. 

Ans. P( 104 ^ Z ^ 106) = 63% 


Problem Set 25.4, page 1067 

1 . 1 = V7 (0.286 — 0)/4.31 = 0.18 < c = 1.94; do not reject the hypothesis. 

3. c = 6090 > 6019; do not reject the hypothesis. 

5. a 2 //? = 1, c = 28.36; do not reject the hypothesis. 

1. p< 28.76 or /x, > 31.24 

9. Alternative /x * 1000, t = V20(996 - 1000)/5 = -3.58 < c = - 2.09 (Table 
A9, 19 degrees of freedom). Reject the hypothesis p = 1000 g. 

11. Test p = 0 against p ^ 0. t = 2 . 1 1 <c = 2.36 (7 degrees of freedom). Do not 
reject the hypothesis. 

13. a = 5 %, c = 1 6.92 > 9 • 0.5 2 /0.4 2 = 14.06; do not reject hypothesis. 

15. t 0 = VlO-9- 17/19 (21.8 - 20.2)/V9 • 0.6 2 + 8 • 0.5 2 = 6.3 > c = 1.74 
(17 degrees of freedom). Reject the hypothesis and assert that B is better. 
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17. v 0 = 50/30 = 1.67 < c = 2.59 [(9, 15) degrees of freedom]; do not reject the 
hypothesis. 

Problem Set 2S.5, page 1071 

1. LCL = 1 - 2.58 • 0.03/V6 = 0.968, UCL = 1.032 
3. n = 10 

5. Choose 4 times the original sample size (why?). 

7. 2.58V0.024/V2 = 0.283, UCL = 27.783, LCL = 27.217 

11. In 30% (5%) of the cases, approximately 

13. UCL = np + 3Vnp(l — p), CL = up, LCL = np — 3V/j/>(1 — p) 

15. CL = p = 2.5, UCL = p 4- 3Vjtt = 7.2, LCL = p — 3\/p is negative in (b) and 
we set LCL = 0. 

Problem Set 25.6, page 1076 

1. 0.9825, 0.9384, 0.4060 
5. P(A; 0) = e~ 30 %l + 300) 

7. P(A: 0) * e~ 509 
11. (1 - 0) 5 , (1 - 0) 5 + 50(1 - 0) 4 
15. <t>((9 - 12 + |)/V 12(1 - 0.12)) 

17.(1 -i) 3 + 3-|(l = \ 

Problem Set 25.7, page 1079 

1. xo 2 = (30 - 50) 2 /50 + (70 - 50) 2 /50 = 16 > c = 3.84; no 
3. 41 

5. Xo 2 = 2.33 < c = 11.07. Yes 

7. <?j = npj = 370/5 = 74, Xo Z ~ 984/74 = 1 3.3, c = 9.49. Reject the hypothesis. 

9. Xo 2 = 1 < 3.84; yes 

13. Combining the results for .v = 10, 11, 12, we have K — r — 1 = 9 (r = 1 since we 
estimated the mean, *§§§§- = 3.87). Xo 2 — 1 2.98 < c = 1 6.92. Do not reject. 

15. Xo 2 = 49/20 + 49/60 = 3.27 < c = 3.84 (1 degree of freedom, a = 5%), which 
supports the claim. 

17. 42 even digits, accept. 

Problem Set 25.8, page 1082 

3. (£) 18 (1 + 18 + 153 + 816) = 0.0038 

5. Hypothesis: A and B are equally good. Then the probability of at least 7 trials 
favorable to A is | 8 + 8 • | 8 = 3.5%. Reject the hypothesis. 

7. Hypothesis p. = 0. Alternative p > 0, x = 1 .58, 
x = Vlo • 1.58/1.23 = 4.06 > c = 1.83 (a = 5%). Hypothesis rejected. 

9. x = 9.67, a- = 1 1.87, t 0 = 9.67/(I !.87/Vl5) = 3.15 > c = 1.76 (a = 5%). 

Hypothesis rejected. 

11. Consider^- = xj — p 0 . 

13. P(T S2) = 0.1% from Table A12. Reject. 


3. 0.8187, 0.6703, 0.1353 

9. 19.5%, 14.7% 

13. Because n is finite 
= 0.22 (if c = 9) 
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15. P(T g 15) = 10.8%. Do not reject. 

17. P(T S 2) = 2.8%. Reject. 

Problem Set 25.9, page 1091 

1. y = 1.9 + * 3. y = 6.7407 + 3.068 a- 5. y = 4 + 4.8a, 172 ft 

7.3- = -1146 + 4.32a 9. 3 - = 0.5932 + 0.1 138a, R = 1/0.1138 

11. q 0 = 76, K = 2.36V76/(7 • 944) = 0.253, CONF 0 . 9s {-1.58 -1.06} 

13. 3^ 2 = 500, 3-s^ = 33.5, k x = 0.067, 3 s y 2 = 2.268, % = 0.023, = 0.021 

CONF 0 95 {0.046 0.088} 


Chapter 25 Review Questions and Problems, page 1092 


21. A = 5.33, a 2 = 1-722 
25. CONF 0 99 { 19.1 ^ (i ^ 33.7} 
29. CONF 0 . 95 { 1.373 S p, ^ 1.451} 


33. c = 14.74 > 14.5; reject 


23. It will double. 

27. CONF 0 . 95 { 0.726 ^ M g 0.751 } 
31. CONF 099 {0.05 ^ a 2 § 10} 

/ 14.74 - 14.40 \ 

35. d> ; — = 0.9842 

V V0.025 / 


37. 30.14/3.8 = 7.93 < 8.25. Reject. 

39. v 0 = 2.5 < 6.0 [(9, 4) degrees of freedom]; accept the hypothesis. 
41. Decrease by a factor V2. By a factor 2.58/1.96 = 1.32. 

43. 0.9953, 0.9825, 0.9384, etc. 45. y = 1.70 + 0.55a 
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A3J Formulas for Special Functions 


For tables of numeric values , see Appendix 5 . 

Exponential function e x (Fig. 544) 

e = 2.71828 18284 59045 23536 02874 71353 


(1) eV = e x /e v = {e x f = 

Natural logarithm (Fig. 545) 

(2) In (a'v) = In x + In y\ In (x/y) = In x — In y, In (A a ) = a In a 
I n x is the inverse of e x 9 and e ln x = a, e~ ln x = e ln a/x) = 1/a*. 

Logarithm of base ten log 10 A or simply log x 

(3) log a* = M In a*, M = log e = 0.43429 44819 03251 82765 11289 18917 

(4) In* = -77 log a% = In 10 = 2.30258 50929 94045 68401 79914 54684 

M M 

log a is the inverse of 10 *, and I 0 log r = a, I 0 _log * = I/a. 

Sine and cosine functions (Figs. 546, 547). In calculus, angles are measured in radians, 
so that sin a* and cos a have period 2i r. 
sin a is odd, sin (—a) = —sin a, and cos a is even, cos (—a) = cos a. 
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(5) 

( 6 ) 

(7) 

( 8 ) 

(9) 

( 10 ) 

(ID 


( 12 ) 


1° = 0.01745 32925 19943 radian 
1 radian = 57° 17' 44.80625" 

= 57.29577 95131° 
sin 2 x + cos 2 x = 1 

sin (* + y) = sin x cos y + cos x sin )' 

sin (jc — >•) = sin x cos y — cos x sin y 

* 

cos (a 4- y ) = cos jc cos y — sin jc sin y 

cos (jc — y) = cos a* cos y + sin a sin y 

sin 2a = 2 sin a cos a, cos 2a = cos 2 a — sin 2 a 

c “ (f - *) 

■ ( ir 
sin — — a 

\2 



sin ( 7 t — a) = sin a, cos (it — a) = —cos a 

cos 2 a = |(1 + cos 2a), sin 2 a = |(1 — cos 2a) 

sin a sin y = |[— cos (a + y) + cos (a — y ) ] 

4 cos a cos y = |[cos (a *f y) + cos (a — y)] 

sin a cos y = |[sin (a 4- y) 4- sin (a — y)] 

u 4- n u — v 
sm u 4- sin u = 2 sm — - — cos — - — 

2 2 

u 4- v u — v 

cos u 4* cos v = 2 cos — - — cos 

2 2 

M + y w — v 
cos y — cos u = 2 sin — - — sin — - — 


(13) A cos x 4- B sin a = Va 2 T 5 2 cos (a ± S), tan 5 = ^ 

(14) A cos a 4- B sin a = Va 2 4- £ 2 sin (a ± 5), tan S = 


cos 5 
sin S 


cos S 
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Tangent, cotangent, secant, cosecant (Figs. 548, 549) 


(15) tan a = 


sin x 
cos a 


(16) 


tan (a 4- y) = 


COS A 

cot a = — : 

sin a 

tan a + tany 


sec a = 


l 


CSC A = 


1 


tan (a — y) = 


cos a sin A 

tan a — tan y 


1 — tan a tan y " *" 14- tan a tan y 

Hyperbolic functions (hyperbolic sine sinh a, etc.; Figs. 550, 551) 


(17) 

sinh x = \{e x - e x ), 

cosh* = %(e x + e *) 

(18) 

sinh* 

tanh a = , , 

cosh A 

cosh* 

coth A = . # 

sinh a 

(19) 

cosh a 4 sinh a = e x n 

cosh a — sinh a = e~ x 

(20) 

cosh 2 a — 

sinh 2 a = 1 

(21) 

sinh 2 a = |(cos1i2a - 1), 

cosh 2 a = |(cosh 2a 4- 1) 




Fig. 551. tanh x (dashed) and coth x 
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(22) 

sinh (a* ± y) = sinh a cosh y ± cosh a sinh y 
cosh (a ± y) = cosh a cosh y ± sinh a sinh y 


(23) 

tanh a ± tanh y 

tanh (x ± y) = , 

1 ± tanh a- tanh y 


Gamma function (Fig. 552 and Table A2 in App. 5). The gamma function T(a:) is defined 
by the integral 

(24) 

r(a) = [ e-'t"- 1 dt 
J o 

(a > 0) 


which is meaningful only if a > 0 (or, if we consider complex a, for those a whose real 
part is positive). Integration by parts gives the important functional relation of the gamma 
function , 

(25) T(a + 1) = aY(a). 

From (24) we readily have T(l) = 1; hence if a is a positive integer, say k , then by 
repeated application of (25) we obtain 

(26) T(k + l) = k\ (* = 0, 1, •••). 

This shows that the gamma function can be regarded as a generalization of the elementary 
factorial function. [Sometimes the notation ( a - 1)! is used for T(a), even for noninteger 
values of a , and the gamma function is also known as the factorial function.] 

By repeated application of (25) we obtain 



Fig. 552. Gamma function 
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and we may use this relation 


(27) 


r(«) = 


r(g + k + 1) 

a(a + 1) • • • (a + k) 


(a* 0 , - 1 , - 2 , •••) 


for defining the gamma function for negative a (# -1, —2, • • •), choosing for k the 
smallest integer such that a + k + 1 > 0. Together with (24), this then gives a definition 
of T (a) for all a not equal to zero or a negative integer (Fig. 552). 

It can be shown that the gamma function may also be represented as the limit of a 
product, namely, by the formula 


(28) 


r(a) = lim 

n— »oa 


n\jf_ 

a(a 4- l)(a + 2) • • • (a 4- n) 


(a *0, -!,•••)• 


From (27) or (28) we see that, for complex a, the gamma function F(a) is a merornorphic 
function with simple poles at a = 0, - 1, -2, • • • . 

An approximation of the gamma function for large positive a is given by the Stirling 
formula 

(29) T(a+l)«V2^^J 

where e is the base of the natural logarithm. We finally mention the special value 

(30) IX!) = Vtt. 

Incomplete gamma functions 

X X 

(31) P(a, x) = f e-H*- 1 dt, Q(a, x) = f e^t"- 1 dt (a > 0) 

j 0 J x 

(32) T(a) = P(a , x) + G(a, x) 

Beta function 

(33) B{x, y) = f f*- J (l - 0 v ~ l dt {x > 0, y > 0) 

•'o 

Representation in terms of gamma functions: 


(34) 


B(x, y ) = 


rwrpo 

Hjc + y) 


Error function (Fig. 553 and Table A4 in App. 5) 


(35) 


erf x = 



dt 


erf x = 


2 


V TT 



,-5 




(36) 
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erf (oo) = 1, complementary error function 
(37) 


erfc x = 1 - erf* = — — * - 


Vtt 'x 


r" 

J dt 


Fresnel integrals 1 (Fig. 554) 

(38) C(x) = f cos (t 2 ) dt, S(jc) = f sin (t 2 ) 

J o J o 

C(°°) = V7t/8, S(<») = Vw/8, complementary functions 


dt 


(39) 


c(x) = ~ C(x) = J cos (r 2 ) dt 

s(x) = /f “ S(a) = j sin (t 2 ) dt 

Sine integral (Fig. 555 and Table A4 in App. 5) 

c x sin t 

(40) Si(x) = —— dt 

J o t 



1 AUGUSTIN FRESNEL (1788-1827). French physicist and mathematician. For cables see Ref. [GRIJ. 
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Si(cc) = tt/2, complementary function 


(41) 


si(A-) = 


7 r 

T 



4 // 


Cosine integral (Table A4 in App. 5) 



(42) 

ci(x) = 

cos t 

dt 

L- t 

(a > 0) 

Exponential integral 




(43) 

Ei(x) = 

f e - * 

J x t 

(a > 0) 

Logarithmic integral 




(44) 

li(A) = 

r* dt 
■*0 Inf 



A3.i Partial Derivatives 


For differentiation formulas , see r'/w/dfe o/ /row* cover. 

Let z = /(a\ y) be a real function of two independent real variables, x and y. If we keep 
y constant, say, y = y l9 and think of x as a variable, then /(a*, y x ) depends on a* alone. If 
the derivative of /(a*, y x ) with respect to a- for a value x = x x exists, then the value of this 
derivative is called the partial derivative of f(x, y) with respect to x at the point (x 1? y t ) 
and is denoted by 


Other notations are 


V 

dx 

or by 

dz 

™ <™> 

fx(x i, yd 

and 

Zx(xi, }’i); 


these may be used when subscripts are not used for another purpose and there is no danger 
of confusion. 
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EXAMPLE 1 


We thus have, by the definition of the derivative, 


( 1 ) 


dx 


on, 2/1) 


= lim 
ax— ►<) 


f(xi + Ax, y x ) - f(x lt yi) 
Ax 


The partial derivative of z — f(x, y) with respect to y is defined similarly; we now keep 
x constant, say, equal to and differentiate f(x x , y) with respect to y. Thus 


( 2 ) 


ay 


tei ,2/1) 


dZ 

dy 


= lim 


tei.yi) 


Ai /->0 


/(* 1, 3’1 + Ay) - /(*!> }’l) 
Ay 


Other notations are f y (: c ls y x ) and z y (xi, y x ). 

It is clear that the values of those two partial derivatives will in general depend on the 
point y x ). Hence the partial derivatives dz/dx and dz/dy at a variable point (a*, 3 ?) are 
functions of x and y. The function dz/dx is obtained as in ordinary calculus by 
differentiating z = f(x , y) with respect to a*, treating y as a constant , and dz/dy is obtained 
by differentiating z with respect to y, treating x as a constant 


Let z = fix. v) = .v 2 y -f x sin v. Tlien 


"J - VJ o _ 

— = 2.vv + sin \\ — = x + x cos y. ■ 

dx ’ ' dy 

The partial derivatives dz/dx and dz/dy of a function z = /( a*, y) have a very simple 
geometric interpretation. The function z = /( x, y) can be represented by a surface in 
space. The equation y = y 1 then represents a vertical plane intersecting the surface in a 
curve, and the partial derivative dz/dx at a point (a 1? y x ) is the slope of the tangent (that 
is, tan a where a is the angle shown in Fig. 556) to the curve. Similarly, the partial 
derivative dz/dy at (x l5 y x ) is the slope of the tangent to the curve a* = x x on the surface 
z = /(a*, y) at (a' 1? y\). 



Fig. 556. Geometrical interpretation of first partial derivatives 
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EXAMPLE 2 


EXAMPLE 3 


The partial derivatives dz/dx and dz/dy are called first partial derivatives or partial 
derivatives of first order . By differentiating these derivatives once more, we obtain the 
four second partial derivatives (or partial derivatives of second order) 2 


( 3 ) 



= /« 


= /< 


yx 


= f: 


xy 




VV 


It can be shown that if all the derivatives concerned are continuous, then the two mixed 
partial derivatives are equal, so that the order of differentiation does not matter (see Ref. 
[GR4] in App. 1), that is, 

9 2 z d z z 

(4) = . 

dx dy dy dx 

For the function in Example 1 . 

f xx 2y, fxy 2.v + cosy — fyx> lyy “* x sin y. I 

By differentiating the second partial derivatives again with respect to x and y, 
respectively, we obtain the third partial derivatives or partial derivatives of the third 
order of /, etc. 

If we consider a function f(x, y, z) of three independent variables , then we have the 
three first partial derivatives f x (x, y, z), f y (x> y, z), and f 2 (x, y, z)- Here f x is obtained by 
differentiating / witli respect to jt, treating both y and z as constants . Thus, analogous to 
(1), we now have 


a/ 

dx 




— lim 

Ax->0 


f(*i + A-y, y± zi) - f(xi, .vx, zi) 
Ax 


etc. By differentiating f x , f y , f z again in this fashion we obtain the second partial 
derivatives of /, etc. 


Let /(x. y. z) = x 2 + y 2 + z 2 + xy e z . Tlien 


f x = 2x + yf, 
fxx = 


fy = 2y + x e z , 

fxy ~ fyx = ^ ' 
fyz~ fzy = X C . 


f z = 2z + xye ! , 
fxz = Arc = y «*, 

A* = 2 + xy e*. ■ 


2 CAUTION! In the subscript notation the subscripts are written in the order in which we differentiate, whereas 
in the notation the order is opposite. 
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A3.3 Sequences and Series 

See also Chap . 15. 

Monotone Real Sequences 

We call a real sequence jc lt a 2 , • • • , x n , • • • a monotone sequence if it is either monotone 
increasing, that is, 

A*1 = A*2 = A3 = ‘ * * 

or monotone decreasing, that is, 

A*! ^ A- 2 ^ A3 ^ ‘ . 

We call a*!, a* 2 » • • • a bounded sequence if there is a positive constant K such that |xj < K 
for all n. 


THEOREM 1 


If a real sequence is bounded and monotone , it converges. 


PROOF Let a*!, a* 2 ? • • • be a bounded monotone increasing sequence. Then its terms are smaller 
than some number B and, since x 1 ^ x n for all n , they lie in the interval x ± ^ x n ^ B , 
which will be denoted by 7 0 . We bisect /o; that is, we subdivide it into two parts of equal 
length. If the right half (together with its endpoints) contains terms of the sequence, we 
denote it by I v If it does not contain terms of the sequence, then the left half of I 0 (together 
with its endpoints) is called I v This is the first step. 

In the second step we bisect 7 1? select one half by the same rule, and call it 7 2 , and so 
on (see Fig. 557 on p. A70). 

In this way we obtain shorter and shorter intervals hi hi • • • with the following 
properties. Each 7 m contains all 7 n for n > m. No term of the sequence lies to the right 
of 7 ?n , and, since the sequence is monotone increasing, all x n with n greater than some 
number N lie in 7 W ; of course, N will depend on ///, in general. The lengths of the l m 
approach zero as m approaches infinity. Hence there is precisely one number, call it L, 
that lies in all those intervals, 3 and we may now easily prove that the sequence is 
convergent with the limit L. 

In fact, given an € > 0, we choose an m such that the length of 7 m is less than e. Then 
L and all the x n with n > N(m) lie in 7 m , and, therefore, |A’ n — L\ < e for all those n. 
This completes the proof for an increasing sequence. For a decreasing sequence the proof 
is the same, except for a suitable interchange of “left” and “right” in the construction of 
those intervals. ■ 


3 This statement seems to be obvious, but actually it is not: it may be regarded as an axiom of the real number 
system in the following form. Let y 2 , * * * be closed intervals such that each J m contains all J n with n > m , 
and the lengths of the J m approach zero as m approaches infinity. Then there is precisely one real number that 
is contained in all those intervals. This is the so-called Cantor-Dedekind axiom, named after the German 
mathematicians GEORG CANTOR (1845-1918), the creator of set theory, and RICHARD DEDEKIND 
(1831-1916). known for his fundamental work in number theory. For further details see Ref. [GR2] in App. I. 
(An interval / is said to be closed if its two endpoints are regarded as points belonging to /. It is said to be open 
if the endpoints are not regarded as points off.) 
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THEOREM 2 


PROOF 


r 


0 

1 

*1 .V 2 *3 

1 1 1 1 L 

i 1 minim 


B 

1 

7 



r y 

T 







Fig. 557. Proof of Theorem 1 


Real Series 


Leibniz Test for Real Series 

Let x lt Jt* 2 , • • • be real and monotone decreasing to zero , that is, 

(1) (a) Xx ^ *2 ^ ^ • • • , (b) lim A* m = 0. 

Tilr-rOC 

Then the series with terms of alternating signs 

A'i — a *2 + a 3 — *4 H — • * • 

converges, and for the remainder R n after the nth term we have the estimate 

(2) \R n \ ^ W 


Let s n be the /7th partial sum of the series. Then, because of (la), 

Sx = A'x, *2 = x l ” *2 = 

•$3 = s 2 -*3 — s 2i s 3 = s l ~ ( x 2 ~ * 3 ) — 

so that s 2 = s% ^ s x - Proceeding in this fashion, we conclude that (Fig. 558) 

(3) Sx ^ $3 = *5 = * • * = *6 = ^4 = ^2 

which shows that the odd partial sums form a bounded monotone sequence, and so do the 
even partial sums. Hence, by Theorem 1, both sequences converge, say, 

lim s 2n +i = lim s 2n = s*. 

?i— »cc n— * oc 


S 2 S 4 S 3 «1 

Fig. SS8. Proof of the Leibniz test 
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Now, since s 2n +i ~ s 2n = A* 2n +i> we readily see that (lb) implies 

s- s* = lim s 2n+ 1 “ lim s 2n = lim (s 2n+1 - s 2n ) = lim -v 2n+1 = 0. 

n— *oc n — >oc n—*zc n — *-oc 

Hence .v* = and the series converges with the sum .v. 

We prove the estimate (2) for the remainder. Since s n s , it follows from (3) that 

s 2 n+i ^ ^ s 2n and also s 2n _i ^ s ^ s 2n . 

By subtracting s 2n and s 2n _ x , respectively, we obtain 

*^2n+i S **2 n — 0* 0 = 5““ $2n— 1 — ^2n ~~ $2n—l m 

In these inequalities, the first expression is equal to jc 2n . fl , the last is equal to — x 2n , and 
the expressions between the inequality signs are the remainders R 2n and R 2n - Thus the 
inequalities may be written 


x 2n + l — ^2n = 0. 0 = ^2n-l — x 2n 

and we see that they imply (2). This completes the proof. 


A3.4 Grad, Div, Curl, V 2 

in Curvilinear Coordinates 

To simplify formulas we write Cartesian coordinates a* = a* 1? y = .v 2 , z = a* 3 . We denote 
curvilinear coordinates by q x , q 2 , q 3 . Through each point P there pass three coordinate 
surfaces q x = const, q 2 — const , q 3 = const. They intersect along coordinate curves. We 
assume the three coordinate curves through P to be orthogonal (perpendicular to each 
other). We write coordinate transformations as 

(1) A*! = A* x (<h, r/ 2 , q 3 ), A' 2 = x 2 (q x , q 2 , <r/ 3 ), a * 3 = x 3 (q x , q 2 , q 3 ). 

Corresponding transformations of grad, div, curl, and V 2 can all be written by using 



Next to Cartesian coordinates, most important are cylindrical coordinates q x = /*, q 2 = ft 
q 3 = z (Fig. 559a on p. All) defined by 

(3) x x = q x cos q 2 = /* cos ft a* 2 = q x sin q 2 = r sin ft x 3 = q 3 = z 
and spherical coordinates q x = r, q 2 = ft q 3 = (Fig. 559b) defined by 4 

(4) Xl ~ ^ C0S S * n = r C0S ® sm ^ *2 = ^ sin sin ^3 = r sin ® sin 

a 3 = ( 7 X cos q 3 = r cos <f>. 


^Tliis is the notation used in calculus and in many other books. It is logical since in it, 0 plays the same role 
as in polar coordinates. CAITION! Some books interchange the roles of 0 and d>. 
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(a) Cylindrical coordinates (6) Spherical coordinates 

Fig. 559. Special curvilinear coordinates 


In addition to the general formulas for any orthogonal coordinates q 2 , q& we shall give 
additional formulas for these important special cases. 

Linear Element ds. In Cartesian coordinates, 

ds 2 = dx? + dxi + dx 3 2 (Sec. 9.5). 

For the ^-coordinates, 

(5) ds 2 = h 2 dq 2 + hi dq 2 2 + h 3 2 dqi . 

(5') ds 2 = dr 2 + r 2 dfP + dz 2 (Cylindrical coordinates). 

For polar coordinates set dz 2 = 0. 

(5") ds 2 = dr 2 + r 2 sin 2 <j> d(F + r 2 d<f> 2 (Spherical coordinates). 

Gradient, grad / = V/ = [f Xj , f Xz , f x J (partial derivatives; Sec. 9.7). In the 
^-system, with u, v, w denoting unit vectors in the positive directions of the q x , q 2 , q 3 
coordinate curves, respectively. 


( 6 ) 


( 6 ') 

( 6 ") 


grad / = V/ = 


hi 


Of i df l df 

u + v + — 

dq i «2 dq 2 h 2 dq 2 


grad / = V.f = 


df l W df 

— u -I — v + — w 

Or r d$ dz 


df 

grad/ = V/ = — u + 
dr 


1 df . 1 df 

— : — T TT v H — w 

rsin<£ 09 r 6<f> 


(Cylindrical coordinates) 
(Spherical coordinates). 


Divergence div F = V*F = (F 1 ) x> + (F 2 ) X2 + (F 3 ) X3 (F = [F lt F 2 , F 3 ], Sec. 9.8); 

(7) divF - *- F = [4r <W'> + i + i <w-,>] 


O') divF = V-F + + 

r dr r 06 dz 


(Cylindrical coordinates) 
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(7") div F = V • F = “S' T~ 0' 2 Fi) + — r— r + — r— r tt (sin <£ Fa) (Spherical coordinates). 

v ' r* dr 1 rsm <f> dd rsm<f> d<j> * r 


Laplacian V 2 / = V*V/ = div (grad/) = /*,*, + f x ^ 2 + f x ^ 3 (Sec. 9.8): 

(8) V 2 f 1 f a /M a a/\, a (Mi j/_\ 4 j j/ \1 

/?i/; 2 /?3 |_ \ fti ^<?i / \ h 2 dq z ) dq 3 \ A3 3 <73 / J 


.2. _ » f ^ 1 v'*!// 


(S') v 2 / = ^4- + -^ + - ? ^ + ^- 

flr 2 r dr ,- 2 a# 2 d; 2 


(Cylindrical coordinates) 


a 2 / 2 a/ 1 a 2 / 1 a 2 / cot<*> a/ 


„ A 17 J WJ ■ v j 1 v j wt y VI 

(8 ) V/ = — H 1 — 5 — 5 “2 H — 5 * — jr "i 2 — TT 

Or 2 r dr r 2 sin 2 <f> d#* r 2 r ?<£ 2 r 2 

Curl (Sec. 9.9): 


— (Spherical coordinates). 


/liU 

/2 2 V 

h 3 w 

3 

a 

3 

ctyi 

dCJ 2 


/h/ 7 ! 


V 7 ® 


curl F = V X F = 


For cylindrical coordinates we have in (9) (as in the previous formulas) 

*1 = K = K 1*2 = 1*0 =<71 = /*, *3 = ^ = 

and for spherical coordinates we have 


/?! = /j r = 1. 


/<2 ~ h u = q x sin <73 = r sin <f>. 


1*3 = f*4 = <?! = /*. 



APPENDIX 4 

Additional Proofs 

Section 2.6, page 73 

PROOF OF THEOREM 1 Uniqueness 1 

Assuming that the problem consisting of the ODE 

(I) y" + p(x)y' + q(x)y = 0 
and the two initial conditions 

(3) y(x 0 ) = K 0 , /(*o) = 

has two solutions y±(x) and y 2 (x) on the interval I in the theorem, we show that their 
difference 

y(x) = y^x) - y 2 (x) 

is identically zero on I; then y 1 = y 2 on I, which implies uniqueness. 

Since (1) is homogeneous and linear, y is a solution of that ODE on /, and since y x and 
y 2 satisfy the same initial conditions, y satisfies the conditions 

(10) y(x 0 ) = 0, /(*o) = 0. 

We consider the function 

z(x) = y(xf + y'(xf 

and its derivative 

= 2yy' + ly'y". 

From the ODE we have 

// / 
y = -py - qy- 

By substituting this in the expression for z we obtain 

(II) Z = 2 yy' - 2 py' 2 - 2 qyy' . 

Now, since y and y are real, 

(y ± y') 2 = y 2 ± 2 yy' + y 2 ^ 0. 

1 This proof was suggested by my colleague, Prof. A. D. Ziebur. In this proof we use formula numbers that 
have not yet been used in Sec. 2.6. 
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From this and the definition of z we obtain the two inequalities 

(12) (a) 2yy' S/ + y' 2 = 2 , (b) -2 lyy' § .v 2 + /* = z. 

From (12b) we have 2yy' ^ —z. Together, |2yy'] S z. For the last term in (11) we now 
obtain 

“2 qyy' S \~2qyy'\ = M|2yy'| S M?- 

Using this result as well as —p = |p| and applying (12a) to the term 2 yy' in (1 1), we find 

t'=z + 2|/?|/ 2 + fo|z. 

Since y' 2 = y 2 + y' 2 = 2 , from this we obtain 

z' s <1 + 2|p| + M)z 

or, denoting the function in parentheses by h, 

(13a) z' § hz for all jc on 1. 

Similarly, from (11) and (12) it follows that 


(13b) 


-z = -2 yy' + 2 py' 2 + 2 qyy' 
^ z + 2\p\z + \q\z = hz- 


The inequalities (13a) and (13b) are equivalent to the inequalities 
(14) z — hz = 0, z + hz = 0. 


Integrating factors for the two expressions on the left are 

F 1 = e ~ fMx) ** and F 2 = e fMx) dx . 

The integrals in the exponents exist because h is continuous. Since F t and F 2 are positive, 
we thus have from (14) 


F^z - hz) = (F 1 z)' = 0 and F 2 {z + hz) = (F 2 z)' = 0. 

This means that F x z is nonincreasing and F 2 z is nondecreasing on /. Since z(x 0 ) = 0 by 
(10), when x = x 0 we thus obtain 


Fi 2 ^ (F lZ ) Xo = 0, F 2 z S (F 2 z) Xo = 0 
and similarly, when x = x 0 , 

F iZ g 0, F 2 z § 0. 

Dividing by F 1 and F 2 and noting that these functions are positive, we altogether have 

2 = 0, 2 S 0 

This implies that z = y 2 + y' 2 = 0 on I. Hence y = 0 or y x = y 2 on /. 


for all x on I. 
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Section 5.4, pages 184 

PROOF OF THEOREM 2 Frobenius Method. Basis of Solutions. Three Cases 

The formula numbers in this proof are the same as in the text of Sec. 5.4. An additional 
formula not appearing in Sec. 5.4 will be called (A) (see below). 

The ODE in Theorem 2 is 


0 ) 


n . 

y + 


b(x) 

x 


y + 



y = o, 


where b( x) and c(x) are analytic functions. We can write it 
(1 ') x 2 y" + xb(x)y' + c(x)y = 0. 

The indicial equation of (1) is 

(4) r(r - 1) + b 0 r + c 0 = 0. 

The roots r lf r 2 of this quadratic equation determine the general fomi of a basis of solutions 
of (1), and there are three possible cases as follows. 

Case 1. Distinct Roots not Differing by an Integer. A first solution of (1) is of the 
form 

(5) y x (x) = x r * (a 0 + a x x + a^ 2 + • • •) 

and can be determined as in the power series method. For a proof that in this case, the 
ODE (1) has a second independent solution of the form 

(6) v 2 (a) = x r * (A 0 + A x x -1- A 2 x 2 + •••)» 
see Ref. [All! listed in App. 1. 

Case 2. Double Root. The indicial equation (4) has a double root r if and only if 
( b 0 — l) 2 — 4c 0 = 0, and then r = |(1 — b 0 ). A first solution 

(7) y t (x) = x r (a 0 + a x x 4- a 2 x 2 +••*), r = |(1 - b 0 ), 

can be determined as in Case 1 . We show that a second independent solution is of the 
form 

(8) y 2 (x) = y x (x) In a* + x r (A x x + A 2 a 2 + • • •) (a > 0). 

We use the method of reduction of order (see Sec. 2.1), that is, we determine m(a) such 
that y 2 (x) = u(x)y 1 (x) is a solution of (1). By inserting this and the derivatives 

y*2 = u'yi + uy[ 9 yl = U f y x + 2uy[ + uy’[ 


into the ODE (1 ; ) we obtain 

A 2 (w"yi + 2 uy[ + uy") + xb(u'y l + uy[) + cuy x = 0. 
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Since y\ is a solution of (l '), the sum of the terms involving u is zero, and this equation 
reduces to 


x 2 y x u " -I- -1- x :by 1 u t = 0. 

By dividing by x 2 )^ and inserting the power series for b we obtain 

II* + (2 — + ^ 0. 

\ .Vl * / 


Here and in the following the dots designate terms that are constant or involve positive 
powers of x. Now from (7) if follows that 

A _ •x r ~ 1 [ra 0 + (/• -1- 1)0^ + • • •] 
y\ Jc’fao + «i* + ■ • •] 


1 / ra 0 4- (r 4 * 1 )#^ 4- • • • \ 
X \ a Q Hh x 4- • * • / 


a 0 "b a i x "b 

Hence the previous equation can be written 

2 r 4- b Q 


(A) 


H , ( 2r + b 0 . \ , 


r 

= - + •••. 
A' 


= 0. 


Since r = (1 - Z? 0 )/2, the term (2 r 4- Z? 0 )/jc equals 1 /jc, and by dividing by u we thus 
have 



By integration we obtain In 11 = -lnx 4- • • • , hence u = (l/*)<? ( ‘ * Expanding the 
exponential function in powers of x and integrating once more, we see that u is of the 
form 

u = In A’ 4- kiX 4* k 2 x 2 4* ■ • • . 


Inserting this into y 2 = w .Vi» we obtain for y 2 a representation of the form (8). 

Case 3. Roots Differing by an Integer. We write r x = r and r 2 = r — p where p is a 
positive integer. A first solution 

(9) y x (x) = x ri (a Q 4- a x x 4- a 2 x 2 4- * • •) 

can be determined as in Cases 1 and 2. We show that a second independent solution is 
of the form 

(10) y 2 (x) = ky x ix) In a: 4- A* r2 (A 0 + A x x 4- A 2 a* 2 4- • • •) 

where we may have k * 0 or k = 0. As in Case 2 we set y 2 = uy x . The first steps are 
literally as in Case 2 and give Eq. (A), 
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Now by elementary algebra, the coefficient b 0 — 1 of r in (4) equals minus the sum of 
the roots, 

b 0 “ 1 = 0*i + r 2 ) = -(r + r - p) = -2r + p. 

Hence 2r + b 0 = p + 1, and division by u 9 gives 

+ 1 




The further steps are as in Case 2. Integrating, we find 
In u = — (/; + 1) In * + • • • , thus 


/ -(p i-i) (• • •> 

u = x e 


where dots stand for some series of nonnegative integer powers of a*. By expanding the 
exponential function as before we obtain a series of the form 


/ 

u 





x 


i * * ’ • 


We integrate once more. Writing the resulting logarithmic term first, we get 

m = £p In* + (— “F ^ + k P+i x +•••)• 

Hence, by (9) we get for y 2 — uy x the formula 

) J 2 = 10 x + ^ ri ” P “ “ • • • — kp-jx*" 1 + • • • j (a 0 + a ± x + • • •)• 

But this is of the form (10) with k = k p since r x — p — r 2 and the product of the two 
series involves nonnegative integer powers of x only. ■ 


Section 5.7, page 205 


THEOREM 


Reality of Eigenvalues 

Ifp, q, r, and p in the Sturm-Liouville equation (1) of Sec. 5.7 are real-valued and 
continuous on the intetyal a ^ x ^ b and r(x) > 0 throughout that interval ( or 
r(x) < 0 throughout that interval ), then all the eigenvalues of the Sturm-Liouville 
problem (1), (2), Sec. 5.7, are real 


PROOF Let A = a + i/3 be an eigenvalue of the problem and let 


y(x) = u(x) -I- iv(x) 

be a corresponding eigenfunction; here a y u, and v are real. Substituting this into (1), 
Sec. 5.7, we have 

(pu l + ipv f ) f + (q + ar + ij3r)(u + iv) = 0. 
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This complex equation is equivalent to the following pah* of equations for the real and 
the imaginary parts: 


(pu'Y + (q 4- ar)u — firv = 0 
( pv')' + (q + ar)v + pru = 0. 

Multiplying the first equation by u, the second by —u and adding, we get 

-/3 (w 2 + v 2 )r = uipv'Y - v{pu f Y 
= [( pv)u - 


The expression in brackets is continuous on a ^ x ^ b, for reasons similar to those in 
the proof of Theorem 1, Sec. 5.7. Integrating over x from a to b, we thus obtain 



u 2 )r dx = 



Because of the boundary conditions the right side is zero; this is as in that proof. Since y 
is an eigenfunction, u 2 + v 2 ^ 0. Since y and r are continuous and r > 0 (or /* < 0) on 
the interval a ^ x ^ b, the integral on the left is not zero. Hence, p = 0, which means 
that A = a is real. This completes the proof. ■ 


Section 7.7, page 308 


Determinants 




The definition of a determinant 





All 

U\2 * * ' 

a l n 



^21 

o 22 * 6 

a 2n 


(7) D = det A = 

• 

. ... 

* 



&nl 

a n2 

Ann 


as given in Sec . 7.7 is unambiguous , 

that is, it yields the same value of D no matter 

which rows or columns we choose in 

developings . 



PROOF In this proof we shall use formula numbers not yet used in Sec. 7.7. 

We shall prove first that the same value is obtained no matter which row is chosen. 

The proof is by induction. The statement is true for a second-order determinant, for 
which the developments by the first row a Y1 a 2 2 + a 12 (-a 21 ) and by the second row 
^ 2 i( ^ 12 ) + ^ 22^11 give the same value a n a 22 - a 12 a 21 . Assuming the statement to be 
true for an (n - l)st-order determinant, we prove that it is true for an nth-order 
determinant. 
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For this purpose we expand D in terms of each of two arbitrary rows, say, the ith and 
the 7 th, and compare the results. Without loss of generality let us assume i < j. 

First Expansion . We expand D by the /th row. A typical term in this expansion is 

(19) a ik C ik = a ik -{-\f +k M ik . 

The minor M ^ of a ik in D is an (n — l)st-order determinant. By the induction hypothesis 
we may expand it by any row. We expand it by the row corresponding to the yth row of 
D. This row contains the entries a# (/ =£ k). It is the (J — I )st row of M ik , because M ik 
does not contain entries of the ith row of D, and / < j. We have to distinguish between 
two cases as follows. 

Case /. If / < fc, then the entry a# belongs to the /th column of M ik (see Fig. 560). Hence 
the term involving in this expansion is 

(20) a jt • (cofactor of a jt in M ik ) = a jt • (- 

where M ikj i is the minor of a# in M ik . Since this minor is obtained from M ik by deleting 
the row and column of aj h it is obtained from D by deleting the ith and yth rows and the 
fcth and Ith columns of D. We insert the expansions of the M ik into that of D. Then it 
follows from (19) and (20) that the terms of the resulting representation of D are of the 
form 

(21a) a ik a n -{-\) b M ikjl (l < k) 

where 

b = / + k + j + / — 1. 

Case II. If / > k , the only difference is that then a jt belongs to the (/ — 1 )st column of 
M ik , because M ik does not contain entries of the kt h column of D, and k < I. This causes 
an additional minus sign in ( 20 ), and, instead of ( 21 a), we therefore obtain 

( 21 b) (/>*) 

where b is the same as before. 


/th kth /eth /th 

col. col. col. col. 


1 1 

i -4)- - 

1 1 

ith row 

.-i— 4— 

1 1 

I 1 

jt h row 

i i 

i i 

i i 


Case I Case II 

Fig. 560. Cases I and II of the two expansions of D 
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Second Expansion. We now expand D at first by the 7 th row. A typical term in this 
expansion is 

( 22 ) cijtC^ = 

By the induction hypothesis we may expand the minor M jt of a jt in D by its ith row, which 
corresponds to the ith row of D, since j > i. 

Case /. If k > /, the entry a ik in that row belongs to the ( k - I )st column of M jh because 
Mji does not contain entries of the / th column of D, and / < k (see Fig. 560). Hence the 
term involving a ik in this expansion is 

(23) a ik ■ (cofactor of a ik in M jt ) = a ik • (- 

where the minor M ikjl of a ik in is obtained by deleting the /th and j th rows and the 
kth and /th columns of D [and is, therefore, identical with M ik j t in (20), so that our notation 
is consistent]. We insert the expansions of the M jt into that of D. It follows from (22) and 
(23) that this yields a representation whose terms are identical with those given by ( 21 a) 
when / < k. 

Case II. If & < /. then a ik belongs to the kth column of Mj h we obtain an additional minus 
sign, and the result agrees with that characterized by ( 21 b). 

We have shown that the two expansions of D consist of the same terms, and this proves 
our statement concerning rows. 

The proof of the statement concerning columns is quite similar; if we expand D in 
terms of two arbitrary columns, say, the A;th and the / th, we find that the general term 
involving aj } a lk is exactly the same as before. This proves that not only all column 
expansions of D yield the same value, but also that their common value is equal to the 
common value of the row expansions of D. 

This completes the proof and shows that our definition of an nth-order determinant is 
unambiguous. ■ 

Section 9.3, page 377 

IPROOF OF FORMULA (2) 

We prove that in right-handed Cartesian coordinates, the vector product 
v = a X b = [flx, a 2 , « 3 ] x |>i, b 2 , b z ] 


has the components 

(2) = Q 2 b 2 ci z b 2 * v 2 — o 2 bi ci}b z . v 2 — ci\b 2 ci 2 bi- 

We need only consider the case v ¥= 0. Since v is perpendicular to both a and b, Theorem 
1 in Sec. 9.2 gives a • v = 0 and b • v = 0: in components [see (2), Sec. 9.2], 


( 3 ) 


ctxVi + a 2 v 2 + a 3 v 3 = 0 
lhv x + b 2 v 2 + b 3 v 3 = 0. 
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Multiplying the first equation by b 3 , the last by a 3 , and subtracting, we obtain 


{a z b x - a l b 3 )v 1 = (a 2 b 3 - a 3 b 2 )u 2 . 


Multiplying the first equation by the last by %, and subtracting, we obtain 


(ci\b 2 £7 2 ^l) y 2 “ ( a 3^1 fl 1^3) u 3- 


We can easily verify that these two equations are satisfied by 

(4) Vi = c(ci 2 b 3 — ci 3 b 2 )j v 2 = c(ci 3 bi — tfi& 3 ), v 3 — c(ciib 2 ~~ u 2 bi) 

where c is a constant. The reader may verify by inserting that (4) also satisfies (3). Now 
each of the equations in (3) represents a plane through the origin in UiU 2 y 3 " s P ace - The 
vectors a and b are normal vectors of these planes (see Example 6 in Sec. 9.2). Since 
v t* 0, these vectors are not parallel and the two planes do not coincide. Hence their 
intersection is a straight line L through the origin. Since (4) is a solution of (3) and, for 
varying c, represents a straight line, we conclude that (4) represents L , and every solution 
of (3) must be of the form (4). In particular, the components of v must be of this form, 
where c is to be determined. From (4) we obtain 

|v| 2 = v z + v 2 z + v 3 2 = c z [(a 2 b 2 - a 2 b 2 ) z + (ci z b l - a x b 2 ) z + (a^ ~ a 2 *i) 2 ]- 
This can be written 

|v| 2 = c 2 [(tfx 2 + a 2 2 + a 3 2 )(b x 2 4- b 2 2 4 b 3 2 ) - 4- a 2 b 2 + <2 3 Z? 3 ) 2 ], 

as can be verified by performing the indicated multiplications in both formulas and 
comparing. Using (2) in Sec. 9.2, we thus have 

|v| 2 = c 2 [( a • a)(b • b) - (a • b) 2 ]. 

By comparing this with formula (12) in Team Project 24 of Problem Set 9.3 we conclude 
that c = ±1. 

We show that c = 4-1. This can be done as follows. 

If we change the lengths and directions of a and b continuously and so that at the end 
a = i and b = j (Fig. 186a in Sec. 9.3), then v will change its length and direction 
continuously, and at the end, v = i x j = k. Obviously we may effect the change so that 
both a and b remain different from the zero vector and are not parallel at any instant. 
Then v is never equal to the zero vector, and since the change is continuous and c can 
only assume the values +1 or —1, it follows that at the end c must have the same value 
as before. Now at the end a = i, b=j, v = k and, therefore, a ± = 1, b 2 = 1, v 3 = 1, 
and the other components in (4) are zero. Hence from (4) we see that v 3 = c = 4-1. This 
proves Theorem 1 . 

For a left-handed coordinate system, i x j = -k (see Fig. 186b in Sec. 9.3), resulting 
in c = - 1. This proves the statement right after formula (2). ■ 
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Section 9.9, page 416 

PROOF OF THE INVARIANCE OF THE CURL 

This proof will follow from two theorems (A and B), which we prove first. 


THEOREM A 


Transformation Law for Vector Components 

For any vector v the components i/ lt V 2 , v 3 and Vi*, v 2 *, v 3 * in any two systems 
of Cartesian coordinates x l5 x 3 , x 3 and x x *, x 2 *, x 3 *, respectively, are related by 


Vi* = c n u x + c 12 v 2 + c 13 v 3 

( 1 ) U 2 * = C 2 iV 1 + C 22 v 2 + C353U3 

v 3 * = C31U1 + c 32 v 2 + C33W3. 

and conversely 

i>i = c n°i* + c 21 v 2 * + c 31 v 3 * 

( 2 ) v 2 = c-^Vy* + c 22 v 2 * + c 32 v 3 * 

V 3 = C 13 Vi* + C 2Z V 2 * + C33U3* 

with coefficients 

c u = i*»i c 12 = i*\j c 13 = i**k 

(3) c 21 = j**i c 22 = j**j C 23 = j^’k 
c 31 = k*»i c 32 = k*»j c 33 = k*k 

satisfying 

3 

(4) 2) ~ & km m U 2, 3), 

3=1 

where the Kronecker delta 2 is given by 


fO ( k + m) 

§ km = I 

Ll (Ic = m) 

and i, j, k and i*, j*, k* denote the unit vectors in the positive av, a 2 -, x 3 - and 
Jfe*-, x 3 * -directions, respectively. 


2 LEOPOLD KRONECKER (1823-1891), German mathematician at Berlin, who made important 
contributions to algebra, group theory, and number theory. 

We shall keep our discussion completely independent of Chap. 7, but readers familiar with matrices should 
recognize that we are dealing with orthogonal transformations and matrices and that our present theorem 
follows from Theorem 2 in Sec. 8.3. 
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PROOF 


THEOREM B 


The representation of v in the two systems are 

(5) (a) v = uj + v 2 j + t> 3 k (b) v = Vi*i* + i> 2 *j* + t> 3 * k*. 

Since i* • 1* = 1, I* • j* = 0, i* • k* = 0, we get from (5b) simply i* • v = vf and 
from this and (5a) 


y x * = i* • v = i* • Uji + i* • v 2 , j + i* • u 3 k = u x i* • i + u 2 i* • j + U 3 I* • k. 

Because of (3), this is the first formula in (1), and the other two formulas are obtained 
similarly, by considering j* • v and then k* • v. Formula (2) follows by the same idea, 
taking i • v = from (5a) and then from (5b) and (3) 


v x = i • v = Uj*i • i* + u 2 *i • j* + u 3 *i • k* = c xl v x * + c 21 u 2 * + c 31 u 3 *. 


and similarly for the other two components. 

We prove (4). We can write (1) and (2) briefly as 

3 3 

(6) (a) Vj = X c„vV m *, (b) v k * = 2 c^Vj. 

m = 1 j = 1 

Substituting Vj into v k *, we get 

33 3/3 \ 

Vk* 2 Vfcj 2 c mjVm* 2 V m ' I 2) ^kj c raj J » 

j“ 1 m=l m-l \j=l / 

where k = 1, 2, 3. Taking k = 1, we have 

t>!* = Vj* 1 2 C U C wj + y 2* 1 2 C U C 2jj + ^3* ^X C U C 3j 

For this to hold for every vector v, the first sum must be 1 and the other two sums 0. This 
proves (4) with k = 1 for m = 1, 2, 3. Taking k = 2 and then k = 3, we obtain (4) with 
k = 2 and 3, for m = 1 , 2, 3. ■ 


Transformation Law for Cartesian Coordinates 

The transformation of any Cartesian x^x^-coordinate system into any other 
Cartesian x - l * x 2 * x 2 * -coordinate system is of the form 

3 

(7) ^ 1 1 2, 3, 

j=l 

with coefficients (3) and constants b lt b 2 , b 3 ; conversely , 

3 

( 8 > *k = X + b k , k= 1 , 2 , 3. 

n=l 
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Theorem B follows from Theorem A by noting that the most general transformation of a 
Cartesian coordinate system into another such system may be decomposed into a 
transformation of the type just considered and a translation; and under a translation, 
corresponding coordinates differ merely by a constant. 


PROOF OF THE INVARIANCE OF THE CURL 

We write again x l9 x 2 , x 3 instead of x , >\ z, and similarly jtj*, x 2 *, * 3 * for other Cartesian 
coordinates, assuming that both systems are right-handed. Let a v a 2f a 3 denote the 
components of curl v in the XjA^-coordinates, as given by (1), Sec. 9.9, with 


* = *i, y = x 2 , z = x 3 . 

Similarly, let a 2 * y ci 3 * denote the components of curl v in the Ai**^*^* -coordinate 
system. We prove that the length and direction of curl v are independent of the particular 
choice of Cartesian coordinates, as asserted. We do this by showing that the components 
of curl v satisfy the transformation law (2), which is characteristic of vector components. 
We consider a x . We use (6a), and then the chain rule for functions of several variables 
(Sec. 9.6). This gives 


dv 3 _ dv 2 
dx 2 dx 3 


3 3/ 

m=l j=l ' 


4, / dv m* _ dV m * 

Cm2 ~exT 

7? i—l 

dv„* dxf _ dv m * dxf \ 
dxf dx 2 Cm2 dxf ~dxf ) 


From this and (7) we obtain 


3 3 

a l ~ S 2 ( c m3 c j2 — c m2 c j3) 

771 =*1 


dxf 


(dvf _ c>u 2 *\ 

- (C33C22 “ C32C23) \-faf J 


+ • • • 


“ ( c 33 c 22 c 32 c 2s) a l* ( c 13 c 32 — c 12 c 33) a 2* + ( c 23 c 12 — c 22 c l?i) a Z* ■ 


Note what we did. The double sum had 3X3 = 9 terms, 3 of which were zero (when 
m = j), and the remaining 6 terms we combined in pairs as we needed them in getting 
#!*, af, af. 


We now use (3), Lagrange’s identity (see Team Project 24 in Problem Set 9.3) and 
k* x j* = -i* and k x j = -I. Then 


C Z3 C 22 ~ C 32 C 23 = (k* • k)(j* • j) - (k* • j)(J* • k) 
= (k* x j*) • (k x j) = i* • i = Cll . 


etc. 
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Hence a x = c u fl!* + c 2 xa 2 * + £ 31 ^ 3 *- This is of the form of the first formula in (2) in 
Theorem A, and the other two formulas of the form (2) are obtained similarly. This proves 
the theorem for right-handed systems. If the JCiJc 2 Jt 3 -coordinates are left-handed, then 
k x j = +i, but then there is a minus sign in front of the determinant in (1), Sec. 9.9. ■ 

Section 10.2, pages 426-427 

IPROOF OF THEOREM 1, PART (b) We prove that if 


(1) f F(r) • dr = f (F x dx + F 2 dy + F 3 dz) 

J c J c 

with continuous F 1? F 2 , F 3 in a domain D is independent of path in D, then F = grad / 
in D far some /; in components 


(2') 



Fo = 


V 

ay 5 


^3 = 


V 

dz 


We choose any fixed A: (. x 0 , y 0 , £0) in & and an y B: (*» y> z) in ^ and define f by 
(3) f(x , y, z) = f 0 + / + F 2 c/y* + F 3 dz*) 

J A 


with any constant / 0 and any path from A to B in D. Since A is fixed and we have 
independence of path, the integral depends only on the coordinates jc, y, z> so that (3) 
defines a function /(jc, y, z) in D. We show that F = grad / with this /, beginning with 
the first of the three relations (2'). Because of independence of path we may integrate 
from A to (jca, y, z) and then parallel to the jc-axis along the segment B X B in Fig. 561 
with B l chosen so that the whole segment lies in D. Then 


r Bl r B 

/(Jc, y, z) = /o + (Fa + F 2 dy * + F 3 + F 2 </y* + F 3 dz*l 

J A 


We now take the partial derivative with respect to jc on both sides. On the left we get 
df/dx. We show that on the right we get F x . The derivative of the first integral is zero 
because A; ( x 0 , y<>, Zo) and B t : ( x i, y, z) do not depend on jc. We consider the second 
integral. Since on the segment B x B y both y and z are constant, the terms F 2 dy* and 



Fig. 561. Proof of Theorem 1 
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F 3 dz* do not contribute to the derivative of the integral. The remaining part can be written 
as a definite integral, 

f F x dx* = f FjCx* y, z) r/.v*. 

j b { J x x 

Hence its partial derivative with respect to x is F x (x, y, z), and the first of the relations 
(2') is proved. The other two formulas in (2') follow by the same argument. ■ 

Section 13.4, page 620 

PROOF OF THEOREM 2 Cauchy-Riemann Equations 
We prove that Cauchy-Riemann equations 

(1) u x = v y , u y = -u x 

are sufficient for a complex function f(z) = u( a, y) 4- iv(x, y) to be analytic; precisely, if 
the real part u and the imaginary part u of f(z) satisfy (1 ) in a domain D in the complex 
plane and if the partial derivatives in ( I) are continuous in D, then /(z) is analytic in D. 

In this proof we write A z = Aa + /Ay and A / = /(z 4- Az) — /(z). The idea of proof 
is as follows. 

(a) We express Af in terms of first partial derivatives of u and u, by applying the mean 
value theorem of Sec. 9.6. 

(b) We get rid of partial derivatives with respect to y by applying the Cauchy-Riemann 
equations. 

(c) We let Az approach zero and show that then AfIAz as obtained approaches a limit, 
which is equal to u x + iv x , the right side of (4) in Sec. 13.4, regardless of the way of 
approach to zero. 

(a) Let P: (a, y) be any fixed point in D. Since D is a domain, it contains a neighborhood 
of P. We can choose a point Q: (a + Aa, y + Ay) in this neighborhood such that the 
straight-line segment PQ is in D . Because of our continuity assumptions we may apply 
the mean value theorem in Sec. 9.6. This yields 

u(x + A a, y 4- Ay) - u( a, y) = (Ax)u x (M 1 ) + (A y)u v (M 1 ) 
v(x + A a, y 4- Ay) - u(a, y) = (A a )v x (M 2 ) + (A y)v y (M 2 ) 

where Mi and M 2 (# M x in general!) are suitable points on that segment. The first line 
is Re Af and the second is Im A/, so that 

Af = (Ax)u x (M 1 ) 4- {Ay)u y (M 1 ) 4- i[(Ax)v x (M 2 ) 4- (Ay)v y {M 2 )\. 

(b) u y = — v x and v y = u x by the Cauchy-Riemann equations, so that 

Af = (Aa) u x (M l) - (A y)v x (M 1 ) 4- i[(Ax)v x (M 2 ) + {Ay)u x (M 2 )\. 
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Also A z = Ax + l Ay, so that we can write Ax = A z — iAy in the first term and 
Ay = (A z — A x)li = — z(Az — A a) in the second term. This gives 

Af = (A z ~ iAy)u x (JMi) + KAz ~ Ax)v x (M 1 ) + i[(Ax)v x (M^) + (Ay)u^M 2 )]. 

By performing the multiplications and reordering we obtain 

Af = (A z)u x (M\) ~ iAy[u x {M x ) - u x (M 2 )} 

+ i[(Az)v x (M x ) - Ax[v x (M x ) - v x {M 2 )}]. 


Division by A z now yields 

Af /Av iAx 

(A) — = u x (M x ) + iv x (M x ) - -^7 {u x (M x ) - u x (M 2 )} “ {v x (M x ) - v x (M 2 )}- 

(c) We finally let A z approach zero and note that \Ay/Az\ ^ 1 and |Ax/Az| ^ 1 in (A). 
Tlien Q: (x - 1 - Ax , y + Ay) approaches P: ( a \ y), so that M x and M 2 must approach P, 
Also, since the partial derivatives in (A) are assumed to be continuous, they approach 
their value at P. In particular, the differences in the braces { • • • } in (A) approach zero. 
Hence the limit of the right side of (A) exists and is independent of the path along which 
A z —> 0. We see that this limit equals the right side of (4) in Sec. 13.4. This means that 
/(z) is analytic at every point z in D, and the proof is complete. ■ 

Section 14.2, pages 647-648 

GOURSArs PROOF OF CAUCHY’S INTEGRAL THEOREM Goursat proved Cauchy’s 
integral theorem without assuming that f'(z) is continuous, as follows. 

We start with the case when C is the boundary of a triangle. We orient C 
counterclockwise. By joining the midpoints of the sides we subdivide the triangle into 
four congruent triangles (Fig. 562). Let Cj, C n , C nl , C IV denote their boundaries. We 
claim that (see Fig. 562). 

(1) <£ f c/z = <fi f dz + <£ f dz + <fi fdz + <$ f dz. 

J c J Ci J c u J c m J c n r 

Indeed, on the right we integrate along each of the three segments of subdivision in both 
possible directions (Fig. 562), so that the corresponding integrals cancel out in pairs, and 
the sum of the integrals on the right equals the integral on the left. We now pick an integral 
on the right that is biggest in absolute value and call its path C x . Then, by the triangle 
inequality (Sec. 13.2), 


<f f dz 

< 

<fi .f dz + <£ f dz 

+ 

<£ f dz + <fi f dz 

g4 

<fi fdz 

J c 


•'c, J c„ 


J c m J c, v 


J c, 



Fig. 562. Proof of Cauchy’s integral theorem 
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We now subdivide the triangle bounded by C 2 as before and select a triangle of 
subdivision with boundary C 2 for which 



Then 




Continuing in this fashion, we obtain a sequence of triangles 7\, T 2 ,- • with boundaries 
Ci, C 2 , • • • that are similar and such that T n lies in T m when n > m, and 


( 2 ) 



n= 1, 2, 


Let Zo be the point that belongs to all these triangles. Since / is differentiable at z = zo, 
the derivative f'(zo) exists. Let 


(3) 


Kz) = 


fiz) - /fa) 
z ~ Zo 


f'(z o). 


Solving this algebraically for f(z) we have 

f(z) = f(z 0 ) + (z - z 0 )/'(zo) + /j (z)(z - Zo)- 


Integrating this over the boundary C„ of the triangle T n gives 


$ f(z ) rfz = ^ /(zo) dz + (z - zo)f'(zo) dz + h(z)(z - z 0 ) dz ■ 

On C„ Cm c„ 

Since /(z 0 ) and /'(zq) are constants and C n is a closed path, the first two integrals on the 
right are zero, as follows from Cauchy’s proof, which is applicable because the integrands 
do have continuous derivatives (0 and const \ respectively). We thus have 


f f(z)dz = f h(z)(z ~ zo) dz. 

0,i c„ 

Since f'(zo) is the limit of the difference quotient in (3), for given e > 0 we can find a 
S > 0 such that 


(4) 


\h(z)\ < e when \z - Zq| < 8. 


We may now take n so large that the triangle T n lies in the disk |z — Zol < 8. Let L n be 
the length of C n . Then |z — z 0 | < L n for all z on C n and zo in T n . From this and (4) we 
have |/i(z)(z — Zo)| < eL n . The ML-inequality in Sec. 14.1 now gives 


(5) 



fiz ) dz 



h(z)(z - z 0 ) dz 


ZkeL n 'L n = €L*. 


Now denote the length of C by L. Then the path C x has the length Lj = U 2, the path C 2 
has the length = LJ 2 = U 4, etc., and C n has the length L n = L/2 n . Hence 
Li = L 2 /4”. From (2) and (5) we thus obtain 


f 

J c 


f dz\ 4” 


f fdz 


g 4 n eL n 2 = 4"e — = eL 2 . 
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By choosing e (> 0) sufficiently small we can make the expression on the right as small 
as we please, while the expression on the left is the definite value of an integral. 
Consequently, this value must be zero, and the proof is complete. 

The proof for the case in which C is the boundary of a polygon follows from the 
previous proof by subdividing the polygon into triangles (Fig. 563). The integral 
corresponding to each such triangle is zero. The sum of these integrals is equal to the 
integral over C, because we integrate along each segment of subdivision in both 
directions, the corresponding integrals cancel out in pairs, and we are left with the integral 
over C. 

The case of a general simple closed path C can be reduced to the preceding one by 
inscribing in C a closed polygon P of chords, which approximates C “sufficiently 
accurately,” and it can be shown that there is a polygon P such that the integral over P 
differs from that over C by less than any preassigned positive real number e, no matter 
how small. The details of this proof are somewhat involved and can be found in Ref. [D6] 
listed in App. 1. ■ 



Fig. 563. Proof of Cauchy’s integral theorem for a polygon 


Section 15.1, page 667 

PROOF OF THEOREM 4 Cauchy’s Convergence Principle for Series 

(a) In this proof we need two concepts and a theorem, which we list first. 

1. A bounded sequence s 2 , • • • is a sequence whose terms all lie in a disk of 
(sufficiently large, finite) radius K with center at the origin; thus \s n \ < K for all /*. 

2. A limit point a of a sequence s x , s 2 , * * * is a point such that, given an e > 0, there 
are infinitely many terms satisfying |s n — a\ < e. (Note that this does not imply 
convergence, since there may still be infinitely many terms that do not lie within that 
circle of radius e and center a.) 

Example: f, §, §, ii» * ' * has the limit points 0 and 1 and diverges. 

3. A bounded sequence in the complex plane has at least one limit point. 
(Bolzano-Weierstrass theorem; proof below. Recall that “sequence” always mean infinite 
sequence.) 

(b) We now turn to the actual proof that Z\ + Z2 + • * * converges if and only if for 
every e > 0 we can find an N such that 

(1) \in+i + • • • + z n +p\ < € for every n > N and p = 1, 2, • • • . 

Here, by the definition of partial sums, 

Zn + 1 “h * * • + Z n +p . 
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THEOREM 


PROOF 


Writing n + p — r, we see from this that (1) is equivalent to 

(1*) \s r — s n \ < e for all r > N and n > N. 

Suppose that s 2 , * * • converges. Denote its limit by s . Then for a given e > 0 we can 

find an N such that 


k» - s\< 


€ 

2 


for every n > N. 


Hence, if r > N and n > /V, then by the triangle inequality (Sec. 13.2), 

kr - S»l = |(Sr - •*) ~ (*« ~ *)| = W ~ s\ + |.V„ ~ *| < J + J = C, 


that is, ( 1 *) holds. 

(c) Conversely, assume that s 2 , • • • satisfies (1*). We first prove that then the 
sequence must be bounded. Indeed, choose a fixed e and a fixed n = ;? 0 > N in (1*)- 
Then (1*) implies that all s r with r> N lie in the disk of radius e and center s no and only 
finitely many terms s l9 • ■ • , s N may not lie in this disk. Clearly, we can now find a circle 
so large that this disk and these finitely many terms all lie within this new circle. Hence 
the sequence is bounded. By the Bolzano-Weierstrass theorem, it has at least one limit 
point, call it s. 

We now show that the sequence is convergent with the limit s. Let £ > 0 be given. 
Then there is an N* such that |j r - j n | < e/2 for ail r > N* and n > /V*, by (1*). Also, 
by the definition of a limit point, \s n - s\ < e /2 for infinitely many so that we can find 
and fix an n > N* such that |s n — ^| < e/2. Together, for every r > A/*, 


kr - *1 = |Or ~ S n ) + (S„ ~ .9)| S \s r ~ ,9 n | + kn ~ ^1 < J + J = 


that is, the sequence s lf s 2 , * * * is convergent with the limit s. 


Bolzano-Weierstrass Theorem 3 

A bounded infinite sequence z±, Z 3 , * * * in the complex plane has at least one 

limit point . 


It is obvious that we need both conditions: a finite sequence cannot have a limit point, 
and the sequence 1 , 2, 3, • • • , which is infinite but not bounded, has no limit point. To 
prove the theorem, consider a bounded infinite sequence ' and let K be such that 

\z n \ < K for all n. If only finitely many values of the z n are different, then, since the 
sequence is infinite, some number z must occur infinitely many times in the sequence, 
and, by definition, this number is a limit point of the sequence. 

We may now turn to the case when the sequence contains infinitely many different 
terms. We draw a large square Q 0 that contains all Zn- We subdivide Q 0 into four congruent 
squares, which we number l, 2, 3, 4. Clearly, at least one of these squares (each taken 


BERNARD BOLZANO (1781-1848), Austrian mathematician and professor of religious studies, was a 
pioneer in the study of point sets, the foundation of analysis, and mathematical logic. 

For Weierstrass, see See. 1 5.5. 
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with its complete boundary) must contain infinitely many terms of the sequence. The 
square of this type with the lowest number (1, 2, 3, or 4) will be denoted by <2i- This is 
the first step. In the next step we subdivide Q x into four congruent squares and select a 
square Q 2 by the same rule, and so on. This yields an infinite sequence of squares Q 0j 
Q v ‘ ‘ > Qm • * * with the property that the side of Q n approaches zero as n approaches 
infinity, and Q m contains all Q n with n > m. It is not difficult to see that the number 
which belongs to all these squares, 4 call it z = a, is a limit point of the sequence. In fact, 
given an e > 0, we can choose an N so large that the side of the square Q N is less than 
€ and, since Q N contains infinitely many we have |z„ — a\ < e for infinitely many n. 
This completes the proof. ■ 

Section 15.3, pages 681-682 

PART (b) OF THE PROOF OF THEOREM 5 

We have to show that 


^ T (z + A z) n - z n 

2j « n TZ 

n=Z *— - 


= 2 Az[(z + Azf- 2 + 2z(z + A z) n ~ 3 + ••• + («- l)z n " 2 ], 

71=2 


thus. 


(z + Az) n - z n 
Az 


- nz n ~ l 


= A^[(z + Az) M_2 + 2z(z + Az) n “ 3 + ••• + («.- l)z n “ 2 ]. 
If we set z + Az = b and z = a, thus A z — b — a, this becomes simply 

b n - a r 


(7a) 


b — a 


— na n = (b — cfiAn 


(n = 2, 3, • • •), 


where A n is the expression in the brackets on the right, 

(7b) A n = b n ~ 2 + 2 ab n ~ 3 + 3 aV' 4 +••• + («- l)a n " 2 ; 

thus, A 2 — 1, A 3 = b -1- 2a, etc. We prove (7) by induction. When n = 2, then (7) holds, 
since then 


b 2 - a 2 
b — a 


-2 a = 


(b + a)(b ~ a) 
b — a 


— 2a = b — a = (b — a)A 2 . 


Assuming that (7) holds for n = fc, we show that it holds for n = k + 1 . By adding and 
subtracting a term in the numerator and then dividing we first obtain 

b k * i - a k+1 b k+1 - ba k + ba k - a fc+1 b k - a k 
b - a b - a b - a 


4 The fact that such a unique number z = a exists seems to be obvious, but it actually follows from an axiom 
of the real number system, the so-called Cantor-Dedekind axiom: see footnote 3 in App. A3. 3. 
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By the induction hypothesis, the right side equals b[(b — a)A k 4- ka k *] + a k . Direct 
calculation shows that this is equal to 

(b - a){bA k + te fc " 1 } + aka k ~' + a k . 

From (7b) with n = k we see that the expression in the braces {• • •} equals 

b k ~ l + 2 ab k ~ 2 + ••• + <*- 1 )ba k ~ 2 + ka k ~ 1 = A k+l . 


Hence our result is 


b k * i - a k+1 . 

- = (b - a)A k+1 + (k + 1 )a k . 


Taking the last teiTn to the left, we obtain (7) with n = k + 1. This proves (7) for any 
integer n S 2 and completes the proof. ■ 

Section 18.2, page 754 

ANOTHER PROOF OF THEOREM 1 without the use of a harmonic conjugate 

We show that if w = u + iv = f(z ) is analytic and maps a domain D conformally onto 
a domain D* and <£>*(«, v) is harmonic in D*, then 

(1) 3>(x, y) = <J>*(m(x, y), v(x, >•)) 

is harmonic in D, that is, V 2 <& = 0 in D. We make no use of a harmonic conjugate of 
<f>*, but use straightforward differentiauon. By the chain rule, 

<*>* = U x + <t> u * V x . 

We apply the chain rule again, underscoring the terms that will drop out when we form 
V 2 4>: 

®xx = ^u*«xx + (4 >uuMx + ^ZvVx)Ux 
+ Vxx + (<KuUx + ®$ V V X )V X . 

is the same with each x replaced by y. We form the sum V 2 <f>. In it, 3>* u = is 
multiplied by 

which is 0 by the Cauchy-Riemann equations. Also V 2 w = 0 and V 2 u = 0. There remains 

v 2 $ = $>* u (u x 2 + u v 2 ) + $* u (v 2 + v y 2 ). 

By the Cauchy-Riemann equations this becomes 


V 2 $ = (<!>*„ + <!>* u )(u x 2 + v 2 ) 


and is 0 since <E>* is harmonic. 
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Tables 


For Tables of Laplace transforms see Secs* 6*8 and 6*9* 

For Tables of Fourier transforms see Sec* 11*10* 

If you have a Computer Algebra System (CAS), you may not need the present tables, 
but you may still find them convenient from time to time . 

Table A1 Bessel Functions 


For more extensive tables see Ref. [GR1] in App. 1. 


X 

Mx) 

AW 

X 

AW 

AW 

X 

AW 

Mx) 

0.0 

1.0000 

0.0000 

3.0 

-0.2601 

0.3391 

6.0 

0.1506 

-0.2767 

0.1 

0.9975 

0.0499 

3.1 

-0.2921 

0.3009 

6.1 

0.1773 

-0.2559 

0.2 

0.9900 

0.0995 

3.2 

-0.3202 

0.2613 

6.2 

0.2017 

-0.2329 

0.3 

0.9776 

0.1483 

3.3 

-0.3443 

0.2207 

6.3 

0.2238 

-0.2081 

0.4 

0.9604 

0.1960 

3.4 

-0.3643 

0.1792 

6.4 

0.2433 

-0.1816 

0.5 

0.9385 

0.2423 

3.5 

-0.3801 

0.1374 

6.5 

0.2601 

-0.1538 

0.6 

0.9120 

0.2867 

3.6 

-0.3918 

0.0955 

6.6 

0.2740 

-0.1250 

0.7 

0.8812 

0.3290 

3.7 

-0.3992 

0.0538 

6.7 

0.2851 

-0.0953 

0.8 

0.8463 

0.3688 

3.8 

-0.4026 

0.0128 

6.8 

0.2931 

-0.0652 

0.9 

0.8075 

0.4059 

3.9 

-0.4018 

-0.0272 

6.9 

0.2981 

-0.0349 

1.0 

0.7652 

0.4401 

4.0 

-0.3971 

-0.0660 

7.0 

0.3001 

-0.0047 

1.1 

0.7196 

0.4709 

4.1 

-0.3887 

-0.1033 

7.1 

0.2991 

0.0252 

1.2 

0.6711 

0.4983 

4.2 

-0.3766 

-0.1386 

7.2 

0.2951 

0.0543 

1.3 

0.6201 

0.5220 

4.3 

-0.3610 

-0.1719 

7.3 

0.2882 

0.0826 

1.4 

0.5669 

0.5419 

4.4 

-0.3423 

-0.2028 

7.4 

0.2786 

0.1096 

1.5 

0.5118 

0.5579 

4.5 

-0.3205 

-0.2311 

7.5 

0.2663 

0.1352 

1.6 

0.4554 

0.5699 

4.6 

-0.2961 

-0.2566 

7.6 

0.2516 

0.1592 

1.7 

0.3980 

0.5778 

4.7 

-0.2693 

-0.2791 

7.7 

0.2346 

0.1813 

1.8 

0.3400 

0.5815 

4.8 

-0.2404 

-0.2985 

7.8 

0.2154 

0.2014 

1.9 

0.2818 

0.5812 

4.9 

-0.2097 

-0.3147 

7.9 

0.1944 

0.2192 

2.0 

0.2239 

0.5767 

5.0 

-0.1776 

-0.3276 

8.0 

0.1717 

0.2346 

2.1 

0.1666 

0.5683 

5.1 

-0.1443 

-0.3371 

8.1 

0.1475 

0.2476 

2.2 

0.1104 

0.5560 

5.2 

-0.1103 

-0.3432 

8.2 

0.1222 

0.2580 

2.3 

0.0555 

0.5399 

5.3 

-0.0758 

-0.3460 

8.3 

0.0960 

0.2657 

2.4 

0.0025 

0.5202 

5.4 

-0.0412 

-0.3453 

8.4 

0.0692 

0.2708 

2.5 

-0.0484 

0.4971 

5.5 

-0.0068 

-0.3414 

8.5 

0.0419 

0.2731 

2.6 

-0.0968 

0.4708 

5.6 

0.0270 

-0.3343 

8.6 

0.0146 

0.2728 

2.7 

-0.1424 

0.4416 

5.7 

0.0599 

-0.3241 

8.7 

-0.0125 

0.2697 

2.8 

-0.1850 

0.4097 

5.8 

0.0917 

-0.3110 

8.8 

-0.0392 

0.2641 

2.9 

-0.2243 

0.3754 

5.9 

0.1220 

-0.2951 

8.9 

-0.0653 

0.2559 


J 0 (x) = 0 for a - = 2.40483. 5.52008, 8.65373, 11.7915, 14.9309, 18.0711, 21.21 16, 24.3525, 27.4935, 30.6346 
J iM = 0 for x = 3.83171, 7.01559, 10.1735, 13.3237, 16.4706, 19.6159, 22.7601, 25.9037, 29.0468, 32.1897 
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Table A1 (continued) 


X 

w 

nw 

X 

Y 0 ( x ) 

Y 1 { x ) 

X 

w 

J ' tC - v ) 

0.0 



2.5 

0.498 

0.146 

5.0 

- 0.309 

0.148 

0.5 

- 0.445 

- 1.471 

3.0 

0.377 

0.325 

5.5 

- 0.339 

- 0.024 

1.0 

0.088 

- 0.781 

3.5 

0.189 

0.410 

6.0 

- 0.288 

- 0.175 

1.5 

0.382 

- 0.412 

4.0 

- 0.017 

0.398 

6.5 

- 0.173 

- 0.274 

2.0 

0.510 

- 0.107 

4.5 

- 0.195 

0.301 

7.0 

- 0.026 

- 0.303 


Table A2 Gamma Function [see (24) in App. A3.1] 


a 

T ( a ) 

a 

n «) 

a 

r ( a ) 

a 

T(a) 

a 

r ( a ) 

1.00 

1.000 000 

1.20 

0.918 169 

1.40 

0.887 264 

1.60 

0.893 515 

1.80 

0.931 384 

1.02 

0.988 844 

1.22 

0.913 106 

1.42 

0.886 356 

1.62 

0.895 924 

1.82 

0.936 845 

1.04 

0.978 438 

1.24 

0,908 521 

1.44 

0.885 805 

1.64 

0.898 642 

1.84 

0.942 612 

1.06 

0.968 744 

1.26 

0.904 397 

1.46 

0.885 604 

1.66 

0.901 668 

1.86 

0.948 687 

1.08 

0.959 725 

1.28 

0.900 718 

1.48 

0.885 747 

1.68 

0.905 001 

1.88 

0.955 071 

1.10 

0.951 351 

1.30 

0.897 471 

1.50 

0.886 227 

1.70 

0.908 639 

1.90 

0.961 766 

1.12 

0.943 590 

1.32 

0.894 640 

1.52 

0.887 039 

1.72 

0.912 581 

1.92 

0.968 774 

1.14 

0.936 416 

1.34 

0.892 216 

1.54 

0.888 178 

1.74 

0.916 826 

1.94 

0.976 099 

1.16 

0.929 803 

1.36 

0.890 185 

1.56 

0.889 639 

1.76 

0.921 375 

1.96 

0.983 743 

1.18 

0.923 728 

1.38 

0.888 537 

1.58 

0.891 420 

1.78 

0.926 227 

1.98 

0.991 708 

1.20 

0.918 169 

1.40 

0.887 264 

1.60 

0.893 515 

1.80 

0.931 384 

2.00 

1.000 000 


Table A3 Factorial Function and Its Logarithm with Base 10 


n 

n \ 

log (/?!) 

n 

n \ 

log («!) 

n 

n ) 

log ( n !) 

1 

1 

0.000 000 

6 

720 

2.857 332 

11 

39 916 800 

7.601 156 

2 

2 

0.301 030 

7 

5 040 

3.702 431 

12 

479 001 600 

8.680 337 

3 

6 

0.778 151 

8 

40 320 

4.605 521 

13 

6 227 020 800 

9.794 280 

4 

24 

1.380 211 

9 

362 880 

5.559 763 

14 

87 178 291 200 

10.940 408 

5 

120 

2.079 181 

10 

3 628 800 

6.559 763 

15 

1 307 674 368 000 

12.116 500 


Table A4 Error Function, Sine and Cosine Integrals [see (35), (40), (42) in App. A3.1] 


.V 

erf a* 

Sito 

ei (. v ) 

A* 

erf a* 

Si « 

Ci(A) 

0.0 

0.0000 

0.0000 

00 

2.0 

0.9953 

1.6054 

- 0.4230 

0.2 

0.2227 

0.1996 

1.0422 

2.2 

0.9981 

1.6876 

- 0.3751 

0,4 

0.4284 

0.3965 

0.3788 

2.4 

0.9993 

1.7525 

- 0.3173 

0.6 

0.6039 

0.5881 

0.0223 

2.6 

0.9998 

1.8004 

- 0.2533 

0.8 

0.7421 

0.7721 

- 0.1983 

2.8 

0.9999 

1.8321 

- 0.1865 

1.0 

0.8427 

0.9461 

- 0.3374 

3.0 

1.0000 

1.8487 

- 0.1196 

1.2 

0.9103 

1.1080 

- 0.4205 

3.2 

1. 0000 

1.8514 

- 0.0553 

1.4 

0.9523 

1.2562 

- 0.4620 

3.4 

1.0000 

1.8419 

0.0045 

1.6 

0.9763 

1.3892 

- 0.4717 

3.6 

1.0000 1 

1.8219 

0.0580 

1.8 

0.9891 

1.5058 

- 0.4568 

3.8 

1.0000 

1.7934 

0.1038 

2.0 

0.9953 

1.6054 

- 0.4230 

4.0 

1.0000 

1.7582 

0.1410 | 
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Table A5 Binomial Distribution 


Probability function f(x) [see (2), Sec. 24.7] and distribution function F(a) 




P- 

= 0.1 

P = 

= 0.2 

P = 

= 0.3 

P 

= 0.4 

P-- 

= 0.5 

n 

A* 

m 

Fix) 

m 

Fix) 

/(•V) 

Fix) 

fix) 

Fix) 

fix) 

Fix) 



0 . 


0 . 


0 . 


0 . 


0 . 


1 

0 

9000 

0.9000 

8000 

0.8000 

7000 

0.7000 

6000 

0.6000 

5000 

0.5000 


1 

1000 

1.0000 

2000 

1.0000 

3000 

1.0000 

4000 

1.0000 

5000 

1.0000 


0 

8100 

0.8100 

6400 

0.6400 

4900 

0.4900 

3600 

0.3600 

2500 

0.2500 

2 

1 

1800 

0.9900 

3200 

0.9600 

4200 

0.9100 

4800 

0.8400 

5000 

0.7500 


2 

0100 

1.0000 

0400 

1.0000 

0900 

1.0000 

1600 

1.0000 

2500 

1.0000 


0 

7290 

0.7290 

5120 

0.5120 

3430 

0.3430 

2160 

0.2160 

1250 

0.1250 

-5 

1 

2430 

0.9720 

3840 

0.8960 

4410 

0.7840 

4320 

0.6480 

3750 

0.5000 

5 

2 

0270 

0.9990 

0960 

0.9920 

1890 

0.9730 

2880 

0.9360 

3750 

0.8750 


3 

0010 

1.0000 

0080 

1.0000 

0270 

1.0000 

0640 

1.0000 

1250 

1.0000 


0 

6561 

0.6561 

4096 

0.4096 

2401 

0.2401 

1296 

0.1296 

0625 

0.0625 


1 

2916 

0.9477 

4096 

0.8192 

4116 

0.6517 

3456 

0.4752 

2500 

0.3125 

4 

2 

0486 

0.9963 

1536 

0.9728 

2646 

0.9163 

3456 

0.8208 

3750 

0.6875 


3 

0036 

0.9999 

0256 

0.9984 

0756 

0.9919 

1536 

0.9744 

2500 

0.9375 


4 

0001 

1.0000 

0016 

1.0000 

0081 

1.0000 

0256 

1.0000 

0625 

1.0000 


0 

5905 

0.5905 

3277 

0.3277 

1681 

0.1681 

0778 

0.0778 

0313 

0.0313 


1 

3281 

0.9185 

4096 

0.7373 

3602 

0.5282 

2592 

0.3370 

1563 

0.1875 

f 

2 

0729 

0.9914 

2048 

0.9421 

3087 

0.8369 

3456 

0.6826 

3125 

0.5000 

j 

3 

0081 

0.9995 

0512 

0.9933 

1323 

0.9692 

2304 

0.9130 

3125 

0.8125 


4 

0005 

1.0000 

0064 

0.9997 

0284 

0.9976 

0768 

0.9898 

1563 

0.9688 


5 

0000 

1.0000 

0003 

1.0000 

0024 

1.0000 

0102 

1.0000 

0313 

1.0000 


0 

5314 

0.5314 

2621 

0.2621 

1176 

0.1176 

0467 

0.0467 

0156 

0.0156 


1 

3543 

0.8857 

3932 

0.6554 

3025 

0.4202 

1866 

0.2333 

0938 

0.1094 


2 

0984 

0.9841 

2458 

0.9011 

3241 

0.7443 

3110 

0.5443 

2344 

0.3438 

6 

3 

0146 

0.9987 

0819 

0.9830 

1852 

0.9295 

2765 

0.8208 

3125 

0.6563 


4 

0012 

0.9999 

0154 

0.9984 

0595 

0.9891 

1382 

0.9590 

2344 

0.8906 


5 

0001 

1.0000 

0015 

0.9999 

0102 

0.9993 

0369 

0.9959 

0938 

0.9844 


6 

0000 

1.0000 

0001 

1.0000 

0007 

1.0000 

0041 

1.0000 

0156 

1. 0000 


0 

4783 

0.4783 

2097 

0.2097 

0824 

0.0824 

0280 

0.0280 

0078 

0.0078 


1 

3720 

0.8503 

3670 

0.5767 

2471 

0.3294 

1306 

0.1586 

0547 

0.0625 


2 

1240 

0.9743 

2753 

0.8520 

3177 

0.6471 

2613 

0.4199 

1641 

0.2266 

7 

3 

0230 

0.9973 

1147 

0.9667 

2269 

0.8740 

2903 

0.7102 

2734 

0.5000 


4 

0026 

0.9998 

0287 

0.9953 

0972 

0.9712 

1935 

0.9037 

2734 

0.7734 


5 

0002 

1.0000 

0043 

0.9996 

0250 

0.9962 

0774 

0.9812 

1641 

0.9375 


6 

0000 

1.0000 

0004 

1.0000 

0036 

0.9998 

0172 

0.9984 

0547 

0.9922 


7 

0000 

1.0000 

0000 

1.0000 

0002 

1.0000 

0016 

1.0000 

0078 

1.0000 


0 

4305 

0.4305 

1678 

0.1678 

0576 

0.0576 

0168 

0.0168 

0039 

0.0039 


1 

3826 

0.8131 

3355 

0.5033 

1977 

0.2553 

0896 

0.1064 

0313 

0.0352 


2 

1488 

0.9619 

2936 

0.7969 

2965 

0.5518 

2090 

0.3154 

1094 

0.1445 


3 

0331 

0.9950 

1468 

0.9437 

2541 

0.8059 

2787 

0.5941 

2188 

0.3633 

8 

4 

0046 

0.9996 

0459 

0.9896 

1361 

0.9420 

2322 

0.8263 

2734 

0.6367 


5 

0004 

1.0000 

0092 

0.9988 

0467 

0.9887 

1239 

0.9502 

2188 

0.8555 


6 

0000 

1.0000 

0011 

0.9999 

0100 

0.9987 

0413 

0.9915 

1094 

0.9648 


7 

0000 

1.0000 

0001 

1.0000 

0012 

0.9999 

0079 

0.9993 

0313 

0.9961 


8 

0000 

1.0000 

0000 

1.0000 

0001 

1.0000 

0007 

1.0000 

0039 

1.0000 
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Table A6 Poisson Distribution 


Probability function f(x) [see (5), Sec. 24.7] and distribution function F(x) 


X 

A = 
fix) 

- tC 
o 

II 


II 

o 

io 

fix) 

= 0.3 
Fix) 

A = 
/(-v) 

ii 

o 

M = 

/(.v) 

n 

p 
*■?! iff 


0 . 


0 . 


0 . 


0 . 


0 . 


0 

9048 

0.9048 

8187 

0.8187 

7408 

0.7408 

6703 

0.6703 

6065 

0.6065 

1 

0905 

0.9953 

1637 

0.9825 

2222 

0.9631 

2681 

0.9384 

3033 

0.9098 

2 

0045 

0.9998 

0164 

0.9989 

0333 

0.9964 

0536 

0.9921 

0758 

0.9856 

3 

0002 

1.0000 

0011 

0.9999 

0033 

0.9997 

0072 

0.9992 

0126 

0.9982 

4 

5 

0000 

1.0000 

0001 

1.0000 

0003 

1.0000 

0007 

0001 

0.9999 

1.0000 

0016 

0002 

0.9998 

1.0000 


.V 

fix) 

ii 

o 

IZ' 

M = 
fix) 

ii 

o 

* 

3.^ 

* 

0.8 

Fix) 

P = 
fix) 

= 0.9 
Fix) 

& 

/(.v) 

II 

*3 ■” 

& 


0 . 


0. 


0 . 


0 . 


0 . 


0 

5488 

0.5488 

4966 

0.4966 

4493 

0.4493 

4066 

0.4066 

3679 

0.3679 

1 

3293 

0.8781 

3476 

0.8442 

3595 

0.8088 

3659 

0.7725 

3679 

0.7358 

2 

0988 

0.9769 

1217 

0.9659 

1438 

0.9526 

1647 

0.9371 

1839 

0.9197 

3 

0198 

0.9966 

0284 

0.9942 

0383 

0.9909 

0494 

0.9865 

0613 

0.9810 

4 

0030 

0.9996 

0050 

0.9992 

0077 

0.9986 

0111 

0.9977 

0153 

0.9963 

5 

0004 

1.0000 

0007 

0.9999 

0012 

0.9998 

0020 

0.9997 

0031 

0.9994 

6 



0001 

1.0000 

0002 

1.0000 

0003 

1.0000 

0005 

0.9999 

7 









0001 

1.0000 


.V 

/(.V) 

= 1.5 
Fix) 

fix) 

= 2 
Fix) 

fix) 

II 

£ ^ 

fix) 

* * 
H 

fix) 

H 

o 


0. 


0. 


0. 


0. 


0. 


0 

2231 

0.2231 

1353 

0.1353 

0498 

0.0498 

0183 

0.0183 

0067 

0.0067 

1 

3347 

0.5578 

2707 

0.4060 

1494 

0.1991 

0733 

0.0916 

0337 

0.0404 

2 

2510 

0.8088 

2707 

0.6767 

2240 

0.4232 

1465 

0.2381 

0842 

0.1247 

3 

1255 

0.9344 

1804 

0.8571 

2240 

0.6472 

1954 

0.4335 

1404 

0.2650 

4 

0471 

0.9814 

0902 

0.9473 

1680 

0.8153 

1954 

0.6288 

1755 

0.4405 

5 

0141 

0.9955 

0361 

0.9834 

1008 

0.9161 

1563 

0.7851 

1755 

0.6160 

6 

0035 

0.9991 

0120 

0.9955 

0504 

0.9665 

1042 

0.8893 

1462 

0.7622 

7 

0008 

0.9998 

0034 

0.9989 

0216 

0.9881 

0595 

0.9489 

1044 

0.8666 

8 

0001 

1.0000 

0009 

0.9998 

0081 

0.9962 

0298 

0.9786 

0653 

0.9319 

9 



0002 

1.0000 

0027 

0.9989 

0132 

0.9919 

0363 

0.9682 

10 





0008 

0.9997 

0053 

0.9972 

0181 

0.9863 

11 





0002 

0.9999 

0019 

0.9991 

0082 

0.9945 

12 





0001 

1.0000 

0006 

0.9997 

0034 

0.9980 

13 







0002 

0.9999 

0013 

0.9993 

14 







0001 

1.0000 

0005 

0.9998 

15 









0002 

0.9999 

16 









0000 

1.0000 
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Table A7 Normal Distribution 

Values of the distribution function <I>(z) [see (3), Sec. 24.8]. 4>(— z) — 1 ~ 


7 

4>(s) 

z 

4>(z) 

z 

<I>(z) 

z 

<I>(z) 

z 

<K(z) 

z 



0 . 


0 . 


0 . 


0 . 


0 . 


0 . 

0.01 

5040 

0.51 

6950 

1.01 

8438 

1.51 

9345 

2.01 

9778 

2.51 

9940 

0.02 

5080 

0.52 

6985 

1.02 

8461 

1.52 

9357 

2.02 

9783 

2.52 

9941 

0.03 

5120 

0.53 

7019 

1.03 

8485 

1.53 

9370 

2.03 

9788 

2.53 

9943 

0.04 

5160 

0.54 

7054 

1.04 

8508 

1.54 

9382 

2.04 

9793 

2.54 

9945 

0.05 

5199 

0.55 

7088 

1.05 

8531 

1.55 

9394 

2.05 

9798 

2.55 

9946 

0.06 

5239 

0.56 

7123 

1.06 

8554 

1.56 

9406 

2.06 

9803 

2.56 

9948 

0.07 

5279 

0.57 

7157 

1.07 

8577 

1.57 

9418 

2.07 

9808 

2.57 

9949 

0.08 

5319 

0.58 

7190 

1.08 

8599 

1.58 

9429 

2.08 

9812 

2.58 

9951 

0.09 

5359 

0.59 

7224 

1.09 

8621 

1.59 

9441 

2.09 

9817 

2.59 

9952 

0.10 

5398 

0.60 

7257 

1.10 

8643 

1.60 

9452 

2.10 

9821 

2.60 

9953 

0.1 1 

5438 

0.61 

7291 

1.11 

8665 

1.61 

9463 

2.11 

9826 

2.61 

9955 

0.12 

5478 

0.62 

7324 

1.12 

8686 

1.62 

9474 

2.12 

9830 

2.62 

9956 

0.13 

5517 

0.63 

7357 

1.13 

8708 

1.63 

9484 

2.13 

9834 

2.63 

9957 

0.14 

5557 

0.64 

7389 

1.14 

8729 

1.64 

9495 

2.14 

9838 

2.64 

9959 

0.15 

5596 

0.65 

7422 

1.15 

8749 

1.65 

9505 

2.15 

9842 

2.65 

9960 

0.16 

5636 

0.66 

7454 

1.16 

8770 

1.66 

9515 

2.16 

9846 

2.66 

9961 

0.17 

5675 

0.67 

7486 

1.17 

8790 

1.67 

9525 

2.17 

9850 

2.67 

9962 

0.18 

5714 

0.68 

7517 

1.18 

8810 

1.68 

9535 

2.18 

9854 

2.68 

9963 

0.19 

5753 

0.69 

7549 

1.19 

8830 

1.69 

9545 

2.19 

9857 

2.69 

9964 

0.20 

5793 

0.70 

7580 

1.20 

8849 

1.70 

9554 

2.20 

9861 

2.70 

9965 

0.21 

5832 

0.71 

7611 

1.21 

8869 

1.71 

9564 

2.21 

9864 

2.71 

9966 

0.22 

5871 

0.72 

7642 

1.22 

8888 

1.72 

9573 

2.22 

9868 

2.72 

9967 

0.23 

5910 

0.73 

7673 

1.23 

8907 

1.73 

9582 

2.23 

9871 

2.73 

9968 

0.24 

5948 

0.74 

7704 

1.24 

8925 

1.74 

9591 

2.24 

9875 

2.74 

9969 

0.25 

5987 

0.75 

7734 

1.25 

8944 

1.75 

9599 

2.25 

9878 

2.75 

9970 

0.26 

6026 

0.76 

7764 

1.26 

8962 

1.76 

9608 

2.26 

9881 

2.76 

9971 

0.27 

6064 

0.77 

7794 

1.27 

8980 

1.77 

9616 

2.27 

9884 

2.77 

9972 

0.28 

6103 

0.78 

7823 

1.28 

8997 

1.78 

9625 

2.28 

9887 

2.78 

9973 

0.29 

6141 

0.79 

7852 

1.29 

9015 

1.79 

9633 

2.29 

9890 

2.79 

9974 

0.30 

6179 

0.80 

7881 

1.30 

9032 

1.80 

9641 

2.30 

9893 

2.80 

9974 

0.31 

6217 

0.81 

7910 

1.31 

9049 

1.81 

9649 

2.31 

9896 

2.81 

9975 

0.32 

6255 

0.82 

7939 

1.32 

9066 

1.82 

9656 

2.32 

9898 

2.82 

9976 

0.33 

6293 

0.83 

7967 

1.33 

9082 

1.83 

9664 

2.33 

9901 

2.83 

9977 

0.34 

6331 

0.84 

7995 

1.34 

9099 

1.84 

9671 

2.34 

9904 

2.84 

9977 

0.35 

6368 

0.85 

8023 

1.35 

9115 

1.85 

9678 

2.35 

9906 

2.85 

9978 

0.36 

6406 

0.86 

8051 

1.36 

9131 

1.86 

9686 

2.36 

9909 

2.86 

9979 

0.37 

6443 

0.87 

8078 

1.37 

9147 

1.87 

9693 

2.37 

9911 

2.87 

9979 

0.38 

6480 

0.88 

8106 

1.38 

9162 

1.88 

9699 

2.38 

9913 

2.88 

9980 

0.39 

6517 

0.89 

8133 

1.39 

9177 

1.89 

9706 

2.39 

9916 

2.89 

9981 

0.40 

6554 

0.90 

8159 

1.40 

9192 

1.90 

9713 

2.40 

9918 

2.90 

9981 

0.41 

6591 

0.91 

8186 

1.41 

9207 

1.91 

9719 

2.41 

9920 

2.91 

9982 

0.42 

6628 

0.92 

8212 

1.42 

9222 

1.92 

9726 

2.42 

9922 

2.92 

9982 

0.43 

6664 

0.93 

8238 

1.43 

9236 

1.93 

9732 

2.43 

9925 

2.93 

9983 

0.44 

6700 

0.94 

8264 

1.44 

9251 

1.94 

9738 

2.44 

9927 

2.94 

9984 

0.45 

6736 

0.95 

8289 

1.45 

9265 

1.95 

9744 

2.45 

9929 

2.95 

9984 

0.46 

6772 

0.96 

8315 

1.46 

9279 

1.96 

9750 

2.46 

9931 

2.96 

9985 

0.47 

6808 

0.97 

8340 

1.47 

9292 

1.97 

9756 

2.47 

9932 

2.97 

9985 

0.48 

6844 

0.98 

8365 

1.48 

9306 

1.98 

9761 

2.48 

9934 

2.98 

9986 

0.49 

6879 

0.99 

8389 

1.49 

9319 

1.99 

9767 

2.49 

9936 

2.99 

9986 

0.50 

6915 | 

1.00 

8413 

1.50 

9332 

2.00 

9772 

2.50 

9938 

3.00 

9987 
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Table A8 Normal Distribution 

Values of z for given values of <£(?) [see (3), Sec. 24.8] and D(z) = — €>(— z) 

Example: z = 0.279 if $>(z) = 61%; z = 0.860 if D(z) = 61%. 
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Table A9 t-Distribution 


Values of z for given values of the distribution function F(z) (see (8) in Sec. 25.3). 
Example: For 9 degrees of freedom, z = 1.83 when F(z ) = 0.95. 


F(z) 

1 

2 

3 

Number 

4 

of Degree 
5 

;s of Free < 
6 

iom 

7 

8 

9 

10 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.32 

0.29 

0.28 

0.27 

0.27 

0.26 

0.26 

0.26 

0.26 

0.26 

0.7 

0.73 

0.62 

0.58 

0.57 

0.56 

0.55 

0.55 

0.55 

0.54 

0.54 

0.8 

1.38 

1.06 

0.98 

0.94 

0.92 

0.91 

0.90 

0.89 

0.88 

0.88 

0.9 

3.08 

1.89 

1.64 

1.53 

1.48 

1.44 

1.41 

1.40 

1.38 

1.37 

0.95 

6.31 

2.92 

2.35 

2.13 | 

2.02 

1.94 

1.89 

1.86 

1.83 

1.81 

0.975 

12.7 

4.30 

3.18 

2.78 

2.57 

2.45 

2.36 

2.31 

2.26 

2.23 

0.99 

31.8 

6.96 

4.54 

3.75 

3.36 

3.14 

3.00 

2.90 

2.82 

2.76 

0.995 

63.7 

9.92 

5.84 

4.60 

4.03 

3.71 

3.50 

3.36 

3.25 

3.17 

0.999 

318.3 

22.3 

10.2 

7.17 

5.89 

5.21 

4.79 

4.50 

4.30 

4.14 


F(z) 

11 

12 

13 

Number 

14 

of Degree 
15 

;s of Free < 
16 

lom 

17 

18 

19 

20 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.7 

0.54 

0.54 

0.54 

0.54 

0.54 

0.54 

0.53 

0.53 

0.53 

0.53 

0.8 

0.88 

0.87 

0.87 

0.87 

0.87 

0.86 

0.86 

0.86 

0.86 

0.86 

0.9 

1.36 

1.36 

1.35 

1.35 

1.34 

1.34 

1.33 

1.33 

1.33 

1.33 

0.95 

1.80 

1.78 

1.77 

1.76 

1.75 

1.75 

1.74 

1.73 

1.73 

1.72 

0.975 

2.20 

2.18 

2.16 

2.14 

2.13 

2.12 

2.11 

2.10 

2.09 

2.09 

0.99 

2.72 

2.68 

2.65 

2.62 

2.60 

2.58 

2.57 

2.55 

2.54 

2.53 

0.995 

3.11 

3.05 

3.01 

2.98 

2.95 

2.92 

2.90 

2.88 

2.86 

2.85 

0.999 

4.02 

3.93 

3.85 

3.79 

3.73 

3.69 

3.65 

3.61 

3.58 

3.55 






Number 

of Degrees of Freedom 




F(z) 

22 

24 

26 

28 

30 

40 

50 

100 

200 

00 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.25 

0.25 

0.25 

0.25 

0.7 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.52 

0.8 

0.86 

0.86 

0.86 

0.85 

0.85 

0.85 

0.85 

0.85 

0.84 

0.84 

0.9 

1.32 

1.32 

1.31 

1.31 

1.31 

1.30 

1.30 

1.29 

1.29 

1.28 

0.95 

1.72 

1.71 

1.71 

1.70 

1.70 

1.68 

1.68 

1.66 

1.65 

1.65 

0.975 

2.07 

2.06 

2.06 

2.05 

2.04 

2.02 

2.01 

1.98 

1.97 

1.96 

0.99 

2.51 

2.49 

2.48 

2.47 

2.46 

2.42 

2.40 

2.36 

2.35 

2.33 

0.995 

2.82 

2.80 

2.78 

2.76 

2.75 

2.70 

2.68 

2.63 

2.60 

2.58 

0.999 

3.50 

3.47 

3.43 

3.41 

3.39 

3.31 

3.26 

3.17 

3.13 

3.09 
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Table A10 Chi-square Distribution 

Values of x for given values of the distribution function F(z) (see Sec. 25.3 before (17)). 


Example: For 3 degrees of freedom, z = 1 1 .34 when F(z) — 0.99. 






Number of Degrees of Freedom 




F(Z) 










10 

1 

2 

3 

4 

5 

6 

7 

8 

9 

0.005 

0.00 

0.01 

0.07 

0.21 

0.41 

0.68 

0.99 

1.34 

1.73 

2.16 

0.01 

0.00 

0.02 

0.11 

0.30 

0.55 

0.87 

1.24 

1.65 

2.09 

2.56 

0.025 

0.00 

0.05 

0.22 

0.48 

0.83 

1.24 

1.69 

2.18 

2.70 

3.25 

0.05 

0.00 

0.10 

0.35 

0.71 

1.15 

1.64 

2.17 

2.73 

3.33 

3.94 

0.95 

3.84 

5.99 

7.81 

9.49 

11.07 

12.59 

14.07 

15.51 

16.92 

18.31 

0.975 

5.02 

7.38 

9.35 

11.14 

12.83 

14.45 

16.01 

17.53 

19.02 

20.48 

0.99 

6.63 

9.21 

11.34 

13.28 

15.09 

16.81 

18.48 

20.09 

21.67 

23.21 

0.995 

7.88 

10.60 

12.84 

14.86 

16.75 

18.55 

20.28 

21.95 

23.59 

25.19 







Number of Degrees of Freedom 




F(z) 









19 

20 

11 

12 

13 

14 

15 

16 

17 

18 

0.005 

2.60 

3.07 

3.57 

4.07 

4.60 

5.14 

5.70 

6.26 

6.84 

7.43 

0.01 

3.05 

3.57 

4.11 

4.66 

5.23 

5.81 

6.41 

7.01 

7.63 

8.26 

0.025 

3.82 

4.40 

5.01 

5.63 

6.26 

6.91 

7.56 

8.23 

8.91 

9.59 

0.05 

4.57 

5.23 

5.89 

6.57 

7.26 

7.96 

8.67 

9.39 

10.12 

10.85 

0.95 

19.68 

21.03 

22.36 

23.68 

25.00 

26.30 

27.59 

28.87 

30.14 

31.41 

0.975 

21.92 

23.34 

24.74 

26.12 

27.49 

28.85 

30.19 

31.53 

32.85 

34.17 

0.99 

24.72 

26.22 

27.69 

29.14 

30.58 

32.00 

33.41 

34.81 

36.19 

37.57 

0.995 

26.76 

28.30 

29.82 

31.32 

32.80 

34.27 

35.72 

37.16 

38.58 

40.00 






Number of Degrees of Freedom 




F(Z) 









29 

30 

21 

22 

23 

24 

25 

26 

27 

28 

0.005 

8.0 

8.6 

9.3 

9.9 

10.5 

11.2 

11.8 

12.5 

13.1 

13.8 

0.01 

8.9 

9.5 

10.2 

10.9 

11.5 

12.2 

12.9 

13.6 

14.3 

15.0 

0.025 

10.3 

11.0 

11.7 

12.4 

I3.J 

13.8 

14.6 

15.3 

16.0 

16.8 

0.05 

11.6 

12.3 

13.1 

13.8 

14.6 

15.4 

16.2 

16.9 

17.7 

18.5 

0.95 

32.7 

33.9 

35.2 

36.4 

37.7 

38.9 

40.1 

41.3 

42.6 

43.8 

0.975 

35.5 

36.8 

38.1 

39.4 

40.6 

41.9 

43.2 

44.5 

45.7 

47.0 

0.99 

38.9 

40.3 

41.6 

43.0 

44.3 

45.6 

47.0 

48.3 

49.6 

50.9 

0.995 

41.4 

42.8 

44.2 

45.6 

46.9 

48.3 

49.6 

51.0 

52.3 

53.7 


F{z) 




Number of Degrees of Freedom 


40 

50 

60 

70 

80 

90 

100 

> 100 (Approximation) 

0.005 

20.7 

28.0 

35.5 

43.3 

51.2 

59.2 

67.3 

i(h - 2.58 f 

0.01 

22.2 

29.7 

37.5 

45.4 

53.5 

61.8 

70.1 

i(A - 2.33) 2 

0.025 

24.4 

32.4 

40.5 

48.8 

57.2 

65.6 

74.2 

|(/t - 1.96)* 

0.05 

26.5 

34.8 

43.2 

51.7 

60.4 

69.1 

77.9 

- 1.64) 2 

0.95 

55.8 

67.5 

79.1 

90.5 

101.9 

113.1 

124.3 

£(/t + 1.64) 2 

0.975 

59.3 

71.4 

83.3 

95.0 

106.6 

118.1 

129.6 

+ ] ,96) 2 

0.99 

63.7 

76.2 

88.4 

100.4 

112.3 

124.1 

135.8 

i(A + 2.33) 2 

0.995 

66.8 

79.5 

92.0 

104.2 

116.3 

128.3 

140.2 

§(A + 2.58) 2 


In the Inst column, h = V2 m — l t where n\ is the number of degrees of freedom. 
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Table All F-Distribution with (m, n) Degrees of Freedom 

Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0*95 
Example: For (7, 4) d.f., z = 6.09 if F(z) = 0.95. 


n 

m = 1 


m = 3 

m — 4 

m — 5 

m = 6 

m = 7 

3 

II 

oo 

ON 

II 

3 

I 

161 

200 

216 

225 

230 

234 

237 

239 

241 

2 

18.5 

19.0 

19.2 

19.2 

19.3 

19.3 

19.4 

19.4 

19.4 

3 

10.1 

9.55 

9.28 

9.12 

9.01 

8.94 

8.89 

8.85 

8.81 

4 

7.71 

6.94 

6.59 

6.39 

6.26 

6.16 

6.09 

6.04 

6.00 

5 

6.61 

5.79 

5.41 

5.19 

5.05 

4.95 

4.88 

4,82 

4.77 

6 

5.99 

5.14 

4.76 

4.53 

4.39 

4.28 

4.21 

4.15 

4.10 

7 

5.59 

4.74 

4.35 

4.12 

3.97 

3.87 

3.79 

3.73 

3.68 

8 

5.32 

4.46 

4.07 

3.84 

3.69 

3.58 

3.50 

3.44 

3.39 

9 

5.12 

4.26 

3.86 

3.63 

3.48 

3.37 

3.29 

3.23 

3.18 

10 

4.96 

4.10 

3.71 

3.48 

3.33 

3.22 

3.14 

3.07 

3.02 

11 

4.84 

3.98 

3.59 

3.36 

3.20 

3.09 

3.01 

2.95 

2.90 

12 

4.75 

3.89 

3.49 

3.26 

3.11 

3.00 

2.91 

2,85 

2.80 

13 

4.67 

3.81 

3.41 

3.18 

3.03 

2.92 

2.83 

2.77 

2.71 

14 

4.60 

3.74 

3.34 

3.11 

2.96 

2.85 

2.76 

2.70 

2.65 

15 

4.54 

3.68 

3.29 

3.06 

2.90 

2.79 

2.71 

2.64 

2.59 

16 

4.49 

3.63 

3.24 

3.01 

2.85 

2.74 

2.66 

2.59 

2.54 

17 

4.45 

3.59 

3.20 

2.96 

2.81 

2.70 

2.61 

2.55 

2.49 

18 

4.41 

3.55 

3.16 

2.93 

2.77 

2.66 

2.58 

2.51 

2.46 

19 

4.38 

3.52 

3.13 

2.90 

2.74 

2.63 

2.54 

2.48 

2.42 

20 

4.35 

3.49 

3.10 

2.87 

2.71 

2.60 

2.51 

2.45 

2.39 

22 

4.30 

3.44 

3.05 

2.82 

2.66 

2.55 

2.46 

2.40 

2.34 

24 

4.26 

3.40 

3.01 

2.78 

2.62 

2.51 

2.42 

2.36 

2.30 

26 

4.23 

3.37 

2.98 

2.74 

2.59 

2.47 

2.39 

2.32 

2.27 

28 

4.20 

3.34 

2.95 

2.71 

2.56 

2.45 

2.36 

2.29 

2.24 

30 

4.17 

3.32 

2.92 

2.69 

2.53 

2.42 

2.33 

2.27 

2.21 

32 

4.15 

3.29 

2.90 

2.67 

2.51 

2.40 

2.31 

2.24 

2.19 

34 

4.13 

3.28 

2.88 

2.65 

2.49 

2.38 

2.29 

2.23 

2.17 

36 

4.11 

3.26 

2.87 

2.63 

2.48 

2.36 

2.28 

2.21 

2.15 

38 

4.10 

3.24 

2.85 

2.62 

2.46 

2.35 

2.26 

2.19 

2.14 

40 

4.08 

3.23 

2.84 

2.61 

2.45 

2.34 

2.25 

2.18 

2.12 

50 

4.03 

3.18 

2.79 

2.56 

2.40 

2.29 

2.20 

2.13 

2,07 

60 

4.00 

3.15 

2.76 

2.53 

2.37 

2.25 

2.17 

2.10 

2.04 

70 

3.98 

3.13 

2.74 

2.50 

2.35 

2.23 

2.14 

2.07 

2,02 

80 

3.96 

3.11 

2.72 

2.49 

2.33 

2.21 

2.13 

2.06 

2.00 

90 

3.95 

3.10 

2.71 

2.47 

2.32 

2.20 

2.11 

2.04 

1.99 

100 

3.94 

3.09 

2.70 

2.46 

2.31 

2.19 

2.10 

2.03 

1.97 

150 

3.90 

3.06 

2.66 

2.43 

2.27 

2.16 

2.07 

2.00 

1.94 

200 

3.89 

3.04 

2.65 

2.42 

2.26 

2.14 

2.06 

1.98 

1.93 

1000 

3.85 

3.00 

2.61 

2.38 

2.22 

2.11 

2.02 

1.95 

1.89 

oc 

3.84 

3.00 

2.60 

2.37 

2.21 

2.10 

2.01 

1.94 

1.88 
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Table All F-Distribution with (m, n) Degrees of Freedom (continued) 

Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.99 


n 


sm 


nt — 4 

m - 5 

m = 6 

m = 7 

m = 8 

m = 9 

1 

4052 

4999 

5403 

5625 

5764 

5859 

5928 

5981 

6022 

2 

98.5 

99.0 

99.2 

99.2 

99.3 

99.3 

99.4 

99.4 

99.4 

3 

34.1 

30.8 

29.5 

28.7 

28.2 

27.9 

27.7 

27.5 

27.3 

4 

21.2 

18.0 

16.7 

16.0 

15.5 

15,2 

15.0 

14.8 

14.7 

5 

16.3 

13.3 

12.1 

11.4 

11.0 

10,7 

10.5 

10.3 

10.2 

6 

13.7 

10.9 

9.78 

9.15 

8.75 

8.47 

8.26 

8.10 

7.98 

7 

12.2 

9.55 

8.45 

7.85 

7.46 

7.19 

6.99 

6.84 

6.72 

8 

11.3 

8.65 

7.59 

7.01 

6.63 

6.37 

6.18 

6.03 

5.91 

9 

10.6 

8.02 

6.99 

6.42 

6.06 

5.80 

5.61 

5.47 

5.35 

10 

10.0 

7.56 

6.55 

5.99 

5.64 

5.39 

5.20 

5.06 

4.94 

11 

9.65 

7.21 

6.22 

5.67 

5.32 

5.07 

4.89 

4.74 

4.63 

12 

9.33 

6.93 

5.95 

5.41 

5.06 

4.82 

4.64 

4.50 

4.39 

13 

9.07 

6.70 

5.74 

5.21 

4.86 

4.62 

4.44 

4.30 

4.19 

14 

8.86 

6.51 

5.56 

5.04 

4.69 

4.46 

4.28 

4.14 

4.03 

15 

8.68 

6.36 

5.42 

4.89 

4.56 

4.32 

4.14 

4.00 

3.89 

16 

8.53 

6.23 

5.29 

4.77 

4.44 

4.20 

4.03 

3.89 

3.78 

17 

8.40 

6.11 

5.18 

4.67 

4.34 

4.10 

3.93 

3.79 

3.68 

18 

8.29 

6.01 

5.09 

4.58 

4.25 

4.01 

3.84 

3.71 

3.60 

19 

8.18 

5.93 

5.01 

4.50 

4.17 

3.94 

3.77 

3.63 

3.52 

20 

8.10 

5.85 

4.94 

4.43 

4.10 

3.87 

3.70 

3.56 

3.46 

22 

7.95 

5.72 

4.82 

4.31 

3.99 

3.76 

3.59 

3.45 

3.35 

24 

7.82 

5.61 

4.72 

4.22 

3.90 

3.67 

3.50 

3.36 

3.26 

26 

7.72 

5.53 

4.64 

4.14 

3.82 

3.59 

3.42 

3.29 

3.18 

28 

7.64 

5.45 

4.57 

4.07 

3.75 

3.53 

3.36 

3.23 

3.12 

30 

7.56 

5.39 

4.51 

4.02 

3.70 

3.47 

3.30 

3.17 

3.07 

32 

7.50 

5.34 

4.46 

3.97 

3.65 

3.43 

3.26 

3.13 

3.02 

34 

7.44 

5.29 

4.42 

3.93 

3.61 

3.39 

3.22 

3.09 

2.98 

36 

7.40 

5.25 

4.38 

3.89 

3.57 

3.35 

3.18 

3.05 

2.95 

38 

7.35 

5.21 

4.34 

3.86 

3.54 

3.32 

3.15 

3.02 

2.92 

40 

7.31 

5.18 

4.31 

3.83 

3.51 

3.29 

3.12 

2.99 

2.89 

50 

7.17 

5.06 

4.20 

3.72 

3.41 

3.19 

3.02 

2.89 

2.78 

60 

7.08 

4.98 

4.13 

3.65 

3.34 

3.12 

2.95 

2.82 

2.72 

70 

7.01 

4.92 

4.07 

3.60 

3.29 

3.07 

2.91 

2.78 

2.67 

80 

6.96 

4.88 

4.04 

3.56 

3.26 

3.04 

2.87 

2.74 

2.64 

90 

6.93 

4.85 

4.01 

3.54 

3.23 

3.01 

2.84 

2.72 

2.61 

100 

6.90 

4.82 

3.98 

3.51 

3.21 

2.99 

2.82 

2.69 

2.59 

150 

6.81 

4.75 

3.91 

3.45 

3.14 

2.92 

2.76 

2.63 

2.53 

200 

6.76 

4.71 

3.88 

3.41 

3.11 

2.89 

2.73 

2.60 

2.50 

1000 

6.66 

4.63 

3.80 

3.34 

3,04 

2.82 

2.66 

2.53 

2.43 

30 

6.63 

4.61 

3.78 

3.32 

3.02 

2.80 

2.64 

2,51 

2.41 
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Table All F-Distribution with (m, n) Degrees of Freedom (continued) 

Values of z for which the distribution function F{z) [see (13), Sec. 25.4] has the value 0.99 


n 

m = 10 

m = 15 

m = 20 

m = 30 

m = 40 

m = 50 

m = 100 

00 

1 

6056 

6157 

6209 

6261 

6287 

6303 

6334 

6366 

2 

99.4 

99.4 

99.4 

99.5 

99.5 

99.5 

99.5 

99.5 

3 

27.2 

26.9 

26.7 

26.5 

26.4 

26.4 

26.2 

26.1 

4 

14.5 

14.2 

14.0 

13.8 

13.7 

13.7 

13.6 

13.5 

5 

10.1 

9.72 

9.55 

9.38 

9.29 

9.24 

9.13 

9.02 

6 

7.87 

7.56 

7.40 

7.23 

7.14 

7.09 

6.99 

6.88 

7 

6.62 

6.31 

6.16 

5.99 

5.91 

5.86 

5.75 

5.65 

S 

5.81 

5.52 

5.36 

5.20 

5.12 

5.07 

4.96 

4.86 

9 

5.26 

4.96 

4.81 

4.65 

4.57 

4.52 

4.42 

4.31 

10 

4.85 

4.56 

4.41 

4.25 

4.17 

4.12 

4.01 

3.91 

11 

4.54 

4.25 

4.10 

3.94 

3.86 

3.81 

3.71 

3.60 

12 

4.30 

4.01 

3.86 

3.70 

3.62 

3.57 

3.47 

3.36 

13 

4.10 

3.82 

3.66 

3.51 

3.43 

3.38 

3.27 

3.17 

14 

3.94 

3.66 

3.51 

3.35 

3.27 

3.22 

3.11 

3.00 

15 

3.80 

3.52 

3.37 

3.21 

3.13 

3.08 

2.98 

2.87 

16 

3.69 

3.41 

3.26 

3.10 

3.02 

2.97 

2.86 

2.75 

17 

3.59 

3.31 

3.16 

3.00 

2.92 

2.87 

2.76 

2.65 

18 

3.51 

3.23 

3.08 

2.92 

2.84 

2.78 

2.68 

2.57 

19 

3.43 

3.15 

3.00 

2.84 

2.76 

2.71 

2.60 

2.49 

20 

3.37 

3.09 

2.94 

2.78 

2.69 

2.64 

2.54 

2.42 

22 

3.26 

2.98 

2.83 

2.67 

2.58 

2.53 

2.42 

2.31 

24 

3.17 

2.89 

2.74 

2.58 

2.49 

2.44 

2.33 

2.21 

26 

3.09 

2.81 

2.66 

2.50 

2.42 

2.36 

2.25 

2.13 

28 

3.03 

2.75 

2.60 

2.44 

2.35 

2.30 

2.19 

2.06 

30 

2.98 

2.70 

2.55 

2.39 

2.30 

2.25 

2.13 

2.01 

32 

2.93 

2.65 

2.50 

2.34 

2.25 

2.20 

2.08 

1.96 

34 

2.89 

2.61 

2.46 

2.30 

2.21 

2.16 

2.04 

1.91 

36 

2.86 

2.58 

2.43 

2.26 

2.18 

2.12 

2.00 

1.87 

38 

2.83 

2.55 

2.40 

2.23 

2.14 

2.09 

1.97 

1.84 

40 

2.80 

2.52 

2.37 

2.20 

2.11 

2.06 

1.94 

1.80 

50 

2.70 

2.42 

2.27 

2.10 

2.01 

1.95 

1.82 

1.68 

60 

2.63 

2.35 

2.20 

2.03 

1.94 

1.88 

1.75 

1.60 

70 

2.59 

2.31 

2.15 

1.98 

1.89 

1.83 

1.70 

1.54 

80 

2.55 

2.27 

2.12 

1.94 

1.85 

1.79 

1.65 

1.49 

90 

2.52 

2.24 

2.09 

1.92 

1.82 

1.76 

1.62 

1.46 

100 

2.50 

2.22 

2.07 

1.89 

1.80 

1.74 

1.60 

1.43 

150 

2.44 

2.16 

2.00 

1.83 

1.73 

1.66 

1.52 

1.33 

200 

2.41 

2.13 

1.97 

1.79 

1.69 

1.63 

1.48 

1.28 

1000 

2.34 

2.06 

1.90 

1.72 

1.61 

1.54 

1.38 

Lit 

X 

2.32 

2.04 

1.88 

1.70 

1.59 

1.52 

1.36 

1.00 
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APP. 5 Tables 


Table A12 Distribution Function F{x) = P[T ^ x) of the Random Variable T 
in Section 2S.8 


.V 

n 

= 11 


0. 

8 

001 

9 

002 

10 

003 

11 

005 

12 

008 

13 

013 

14 

020 

15 

030 

16 

043 

17 

060 

18 

082 

19 

109 

20 

141 

21 

179 

22 

223 

23 

271 

24 

324 

25 

381 

26 

440 

27 

500 


X 

n 

= 10 


0. 

6 

001 

7 

002 

8 

005 

9 

008 

10 

014 

11 

023 

12 

036 

13 

054 

14 

078 

15 

108 

16 

146 

17 

190 

18 

242 

19 

300 

20 

364 

21 

431 

22 

500 


X 

n 

=8 


0 . 

2 

001 

3 

003 

4 

007 

5 

016 

6 

031 

7 

054 

8 

089 

9 

138 

10 

199 

11 

274 

12 

360 

13 

452 


x 

n 

=9 


0. 

4 

001 

5 

003 

6 

006 

7 

012 

8 

022 

9 

038 

10 

060 

II 

090 

12 

130 

13 

179 

14 

238 

15 

306 

16 

381 

17 

460 


X 

n 

=7 


0. 

1 

001 

2 

005 

3 

015 

4 

035 

5 

068 

6 

119 

7 

191 

8 

281 

9 

386 

10 

i 

500 


X 

n 

=6 


0 . 

0 

001 

1 

008 

2 

028 

3 

068 

4 

136 

5 

235 

6 

360 

7 

500 


X 

n 

=5 


0. 

0 

008 

1 

042 

2 

117 

3 

242 

4 

408 



n 

X 

=4 


0 . 

0 

042 

1 

167 

2 

375 



n 

.V 

=3 


0. 

0 

167 

1 

500 


X 

n 

=20 


0. 

50 

001 

51 

002 

52 

002 

53 

003 

54 

004 

55 

005 

56 

006 

57 

007 

58 

008 

59 

010 

60 

012 

61 

014 

62 

017 

63 

020 

64 

023 

65 

027 

66 

032 

67 

037 

68 

043 

69 

049 

70 

056 

71 

064 

72 

073 

73 

082 

74 

093 

75 

104 

76 

117 

77 

130 

78 

144 

79 

159 

80 

176 

81 

193 

82 

211 

83 

230 

84 

250 

85 

271 

86 

293 

87 

315 

88 

339 

89 

362 

90 

387 

91 

411 

92 

436 

93 

462 

94 

487 


X 

n 

= 19 


0. 

43 

001 

44 

002 

45 

002 

46 

003 

47 

003 

48 

004 

49 

005 

50 

006 

51 

008 

52 

010 

53 

012 

54 

014 

55 

017 

56 

021 

57 

025 

58 

029 

59 

034 

60 

040 

61 

047 

62 

054 

63 

062 

64 

072 

65 

082 

66 

093 

67 

105 

68 

119 

69 

133 

70 

149 

71 

166 

72 

184 

73 

203 

74 

223 

75 

245 

76 

267 

77 

290 

78 

314 

79 

339 

80 

365 

81 

391 

82 

418 

83 

445 

84 

473 

85 

500 


X 

n 

= 18 


0. 

38 

001 

39 

002 

40 

003 

41 

003 

42 

004 

43 

005 

44 

007 

45 

009 

46 

011 

47 

013 

48 

016 

49 

020 

50 

024 

51 

029 

52 

034 

53 

041 

54 

048 

55 

056 

56 

066 

57 

076 

58 

088 

59 

100 

60 

115 

61 

130 

62 

147 

63 

165 

64 

184 

65 

205 

66 

227 

67 

250 

68 

275 

69 

300 

70 

327 

71 

354 

72 

383 

73 

411 

74 

441 

75 

470 

76 

500 


it 

n 

= 17 


0. 

32 

001 

33 

002 

34 

002 

35 

003 

36 

004 

37 

005 

38 

007 

39 

009 

40 

011 

41 

014 

42 

017 

43 

021 

44 

026 

45 

032 

46 

038 

47 

046 

48 

054 

49 

064 

50 

076 

51 

088 

52 

102 

53 

118 

54 

135 

55 

154 

56 

174 

57 

196 

58 

220 

59 

245 

60 

271 

61 

299 

62, 

328 

63 

358 

64 

388 

65 

420 

66 

452 

67 

484 


X 

n 

= 16 


0. 

27 

001 

28 

002 

29 

002 

30 

003 

31 

004 

32 

006 

33 

008 

34 

010 

35 

013 

36 

016 

37 

021 

38 

026 

39 

032 

40 

039 

41 

048 

42 

058 

43 

070 

44 

083 

45 

097 

46 

114 

47 

133 

48 

153 

49 

175 

50 

199 

51 

225 

52 

253 

53 

282 

54 

313 

55 

345 

56 

378 

57 

412 

oo 

447 

59 

482 


X 

n 

= 15 


0. 

23 

001 

24 

002 

25 

003 

26 

004 

27 

006 

28 

008 

29 

010 

30 

014 

31 

018 

32 

023 

33 

029 

34 

037 

35 

046 

36 

057 

37 

070 

38 

084 

39 

101 

40 

120 

41 

141 

42 

164 

43 

190 

44 

218 

45 

248 

46 

279 

47 

313 

48 

349 

49 

385 

50 

423 

51 

461 

52 

500 


X 

n 

= 14 


0 . 

18 

001 

19 

002 

20 

002 

21 

003 

22 

005 

23 

007 

24 

010 

25 

013 

26 

018 

27 

024 

28 

031 

29 

040 

30 

051 

31 

063 

32 

079 

33 

096 

34 

117 

35 

140 

36 

165 

37 

194 

38 

225 

39 

259 

40 

295 

41 

334 

42 

374 

43 

415 

44 

457 

45 

500 


X 

n 

= 13 


0. 

14 

001 

15 

001 

16 

002 

17 

003 

18 

005 

19 

007 

20 

011 

21 

015 

22 

021 

23 

029 

24 

038 

25 

050 

26 

064 

27 

082 

28 

102 

29 

126 

30 

153 

31 

184 

32 

218 

33 

255 

34 

295 

35 

338 

36 

383 

37 

429 

38 

476 


X 

n 

= 12 


0. 

11 

001 

12 

002 

13 

003 

14 

004 

15 

007 

16 

010 

17 

016 

18 

022 

19 

031 

20 

043 

21 

058 

22 

076 

23 

098 

24 

125 

25 

155 

26 

190 

27 

230 

28 

273 

29 

319 

30 

369 

31 

420 

32 

473 
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Fehlberg 894 
Fibonacci numbers 683 
Field 

conservative 415, 428 
of force 385 

gravitational 385, 407, 41 1, 587 
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Field ( Cant .) 

irrotational 415, 765 
scalar 384 
vector 384 
velocity 385 

Finite complex plane 710 
First 

fundamental form 457 
Green’s formula 466 
shifting theorem 224 
Fisher, R. A. 1047, 1066, 1077 
F-distribution 1066, A 102 
Fixed 

decimal point 781 
point 736, 781, 787 
Flat spring 68 
Floating point 781 
Flow augmenting path 975 
Flows in networks 973-981 
Fluid flow 412, 463, 761 
Flux 412, 450 
integral 450 

Folium of Descartes 399 
Forced oscillations 84, 499 
Ford-Fulkerson algorithm 979 
Forest 970 
Form 

Hermitian 361 
quadratic 353 
skew-Hermitian 361 
Forward 

differences 804 
edge 974, 976 
Four-color theorem 987 
Fourier 477 

-Bessel series 213, 583 
coefficients 480, 487 
coefficients, complex 497 
constants 210 
cosine integral 511 
cosine series 491 
cosine transform 514, 529 
double series 576 
half-range expansions 494 
integral 508, 563 
integral, complex 519 
-Legendre series 212, 590 
matrix 525 
series 211, 480, 487 
series, complex 497 


Fourier (Cont.) 

series, generalized 210 
sine integral 511 
sine series 491, 543 
sine transform 514, 530 
transform 5 1 9, 53 1 , 565 
transform, discrete 525 
transform, fast 526 
Fractional linear transformation 734 
Fraction defective 1073 
Fredholm 201 
Free 

fall 18 

oscillations 61 
Frenet formulas 400 
Frequency 63 

of values in samples 994 
Fresnel integrals 690, A65 
Friction 18-19 
Frobenius 182 
method 182 
norm 849 
theorem 869 
Fulkerson 979 
Full- wave rectifier 248 
Function 

analytic 175, 617 

Bessel 191, 198, 202, 207, A94 

beta A64 

bounded 38 

characteristic 542, 574 

complex 614 

conjugate harmonic 622 

entire 624, 661,711 

error 568, 690, A64, A95 

even 490 

exponential 57, 623, A60 
factorial 192, 1008, A95 
gamma 192, A95 
Hankel 202 

harmonic 465, 622, 772 
holomorphic 617 
hyperbolic 628, 743, A62 
hypergeometric 188 
inverse hyperbolic 634 
inverse trigonometric 634 
Legendre 177 
logarithmic 630, A60 
meromorphic 711 
Neumann 201 
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Function ( Cont .) 
odd 490 

orthogonal 205, 482 
orthonormal 205, 210 
periodic 478 
probability 1012, 1033 
rational 617 
scalar 384 
space 382 
staircase 248 
step 234 

trigonometric 626, 688, A60 
unit step 234 
vector 384 
Function space 327 
Fundamental 
form 457 
matrix 139 
mode 542 
period 485 

system 49, 106, 113, 138 
theorem of algebra 662 

G 

Galilei 15 

Gamma function 192, A95 
GAMS 778 
Gauss 188 

distribution 1026 
divergence theorem 459 
elimination method 289, 834 
hypergeometric equation 188 
integration formula 826 
-Jordan elimination 317, 844 
least squares 860, 1084 
quadrature 826 
-Seidel iteration 846, 913 
General 

powers 632 

solution 64, 106, 138, 159 
Generalized 

Fourier series 210 
function 242 
solution 545 
triangle inequality 608 
Generating function 181, 216, 258 
Geometric 

multiplicity 337 

series 167, 668, 673, 687, 692 


Gerschgorin’s theorem 866 
Gibbs phenomenon 490, 510 
Global error 887 
Golden Rule 15, 23 
Goodness of fit 1076 
Gosset 1066 
Goursat 648, A88 
Gradient 403, 415, 426, A72 
method 938 
Graph 955 

bipartite 982 
complete 958 
Euler 963 
planar 987 
sparse 957 
weighted 959 

Gravitation 385, 407, 411, 587 
Greedy algorithm 967 
Greek alphabet: Back cover 
Green 439 

formulas 466 
theorem 439, 466 
Gregory-Newton formulas 805 
Growth restriction 225 
Guldin’s theorem 458 


H 

Hadamard’s formula 676 
Half-life time 9 
Half-plane 613 
Half-range Fourier series 494 
Half-wave rectifier 248, 489 
Halving 819, 824 
Hamiltonian cycle 960 
Hanging cable 198 
Hankel functions 202 
Hard spring 159 
Harmonic 

conjugate 622 
function 465, 622, 772 
oscillation 63 
series 670 
Heat 

equation 464, 536, 553, 757, 923 
potential 758 
Heaviside 221 
expansions 245 
formulas 247 
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Heaviside ( Cont .) 

function 234 
Helicoid 449 
Helix 391, 394, 399 
Helmholtz equation 572 
Henry 92 
Hermite 

interpolation 816 
polynomials 216 
Hermitian 357, 361 
Hertz 63 

Hesse’s normal form 375 
Heun’s method 890 
High-frequency line equations 594 
Hilbert 201,326 
matrix 858 
space 326 
Histogram 994 
Holomorphic 617 
Homogeneous 

differential equation 27, 46, 105, 535 
system of equations 288, 304, 833 
Hooke’s law 62 

Householder’s tridiagonalization 875 
Hyperbolic 

differential equation 551, 909, 928 
functions, complex 628, 743 
functions, real A62 
paraboloid 448 

partial differential equations 551, 928 
spiral 399 
Hypergeometric 

differential equation 188 
distribution 1024 
functions 188 
series 188 
Hypocycloid 399 
Hypothesis 1058 

I 

Idempotent matrix 286 
Identity 

of Lagrange 383 
matrix (see Unit matrix) 
theorem for power series 679 
transformation 736 
trick 351 

Ill-conditioned 851 
Image 327, 729 


Imaginary 
axis 604 
part 602 
unit 602 

Impedance 94, 98 
Implicit solution 20 
Improper integral 222, 719, 722 
Impulse 242 
IMSL 778 
Incidence 
list 957 
matrix 958 

Inclusion theorem 868 
Incomplete gamma function A64 
Incompressible 765 
Inconsistent equations 292, 303 
Increasing sequence A69 
Indefinite 

integral 637, 650 
integration 640 

Independence of path 426, 648 
Independent 
events 1004 
random variables 1036 
Indicial equation 184 
Indirect method 845 
Inductance 92 
Inductor 92 
Inequality 

Bessel 215, 504 
Cauchy 660 
ML- 644 
Schur 869 

triangle 326, 372, 608 
Infinite 

dimensional 325 
population 1025, 1045 
sequence 664 
series ( see Series) 

Infinity 710, 736 
Initial 

condition 6, 48, 137, 540 
value problem 6, 38, 48, 107, 886, 902 
Injective mapping 729 
Inner product 325, 359, 371 
space 326 
Input 26, 84, 230 
Instability (see Stability) 

Integral 

contour 647 



Index 


111 


Integral {Com.) 
definite 639 
double 433 
equation 252 
Fourier 508, 563 
improper 719, 722 
indefinite 637, 650 
line 421, 633 
surface 449 

theorems, complex 647, 654 
theorems, real 439, 453, 469 
transform 221, 513 
triple 458 

Integrating factor 23 
Integration 

complex functions 637-663, 701-727 

Laplace transforms255 

numeric 817-827 

power series 680 

series 695 

Integro-differential equation 92 
Interest 9, 33 
Interlacing of zeros 197 
Intermediate value theorem 796 
Interpolation 797-815 
Hermite 816 
Lagrange 798 
Newton 802, 805, 807 
spline 811 

Interquartile range 995 
Intersection of events 998 
Interval 

closed A69 

of convergence 172, 676 
estimate 1046 
open 4, A69 
Invariant subspace 865 
Inverse 

hyperbolic functions 634 
mapping principle 733 
of a matrix 315, 844 
trigonometric functions 634 
Inversion 735 
Investment 9, 33 
Irreducible 869 
Irregular boundary 919 
Irrotational 415, 765 
Isocline 10 

Isolated singularity 707 
Isotherms 758 


Iteration 

for eigenvalues 872 
for equations 787-794 
Gauss-Seidel 846, 913 
Jacobi 850 
Picard 41 

J 

Jacobian 436, 733 
Jacobi iteration 850 
Jerusalem, Shrine of the Book 814 
Jordan 316 
Joukowski airfoil 732 

K 

Kirchhoffs laws 92, 973 
Kronecker delta 210, A83 
Kruskal’s algorithm 967 
Kutta 892 

L 

^2> 853 

L 2 863 
Labeling 968 
Lagrange 50 

identity of 383 
interpolation 798 
Laguerre polynomials 209, 257 
Lambert’s law 43 
LAPACK 778 
Laplace 221 

equation 407, 465, 536, 579, 587, 910 
integrals 512 
limit theorem 1031 
operator 408 
transform 221, 594 
Laplacian 443, A73 
Latent root 324 
Laurent series 701, 712 
Law of 

absorption 43 
cooling 14 
gravitation 385 
large numbers 1032 
mass action 43 

the mean (see Mean value theorem) 
LC-circuit 97 
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LCL 1068 

Least squares 860, 1084 
Lebesgue 863 
Left-hand 

derivative 484 
limit 484 
Left-handed 378 
Legendre 177 

differential equation 177, 204, 590 
functions 177 

polynomials 179, 207, 590, 826 
Leibniz 14 

convergence test A70 
Length 

of a curve 393 
of a vector 365 
Leonardo of Pisa 638 
Leontief 344 
Leslie model 341 
Libby 13 

Liebmann’s method 913 
Likelihood function 1047 
Limit 

of a complex function 615 

cycle 157 

left-hand 484 

point A90 

right-hand 484 

of a sequence 664 

vector 386 

of a vector function 387 
Lineal element 9 
Linear 

algebra 27 1-363 
combination 106, 325 
dependence 49, 74, 106, 108, 297, 325 
differential equation 26, 45, 105, 535 
element 394, A72 
fractional transformation 734 
independence 49, 74, 106, 108, 297, 325 
interpolation 798 
operator 60 
optimization 939 
programming 939 
space (see Vector space) 
system of equations 287, 833 
transformation 281, 327 
Linearization of systems of ODEs 151 
Line Integral 421, 633 
Lines of force 751 


LINPACK 779 
Liouville 203 
theorem 661 
Lipschitz condition 40 
List 957 
Ljapunov 148 
Local 

error 887 
minimum 937 
Logarithm 630, 688, A60 
Logarithmic 

decrement 69 
integral A66 
spiral 399 

Logistic population law 30 

Longest path 959 

Loss of significant digits 785 

Lotka-Volterra population model 154 

Lot tolerance per cent defective 1075 

Lower 

control limit 1068 
triangular matrix 283 
LTPD 1075 
LU-factorization 841 


M 

Maclaurin 683 
series 683 
trisectrix 399 

Magnitude of a vector (see Length) 

Main diagonal 274, 309 
Malthus’s law 5, 31 
MAPLE 779 
Mapping 327, 729 
Marconi 63 

Marginal distributions 1035 
Markov process 285, 341 
Mass-spring system 61, 86, 135, 150, 243, 252, 
261, 342, 499 
Matching 985 
MATHCAD 779 
MATHEMATICA 779 
Mathematical expectation 1019, 1038 
MATLAB 779 
Matrix 

addition 275 
augmented 288, 833 
band 914 
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Matrix ( Cont .) 
diagonal 284 

eigenvalue problem 333-363, 863-882 

Hermitian 357 

identity ( see Unit matrix) 

inverse 315 

inversion 315, 844 

multiplication 278, 321 

nonsingular 315 

norm 849, 854 

normal 362, 869 

null (see Zero matrix) 

orthogonal 345 

polynomial 865 

scalar 284 

singular 315 

skew-Hermitian 357 

skew-symmetric 283, 345 

sparse 812, 912 

square 274 

stochastic 285 

symmetric 283, 345 

transpose 282 

triangular 283 

tridiagonal 812, 875, 914 

unit 284 

unitary 357 

zero 276 

Max-flow min-cut theorem 978 
Maximum 937 
flow 979 

likelihood method 1047 
matching 983 
modulus theorem 772 
principle 773 
Mean convergence 214 
Mean-square convergence 214 
Mean value of a (an) 
analytic function 77 1 
distribution 1016 
function 764 
harmonic function 772 
sample 996 

Mean value theorem 402, 434, 454 
Median 994, 1081 
Membrane 569-586 
Meromorphic function 711 
Mesh incidence matrix 278 
Method of 

false position 796 


Method of (Cont.) 

least squares 860, 1084 
moments 1046 
steepest descent 938 
undetermined coefficients 78, 1 17, 160 
variation of parameters 98, 1 18, 160 
Middle quartile 994 
Minimum 937, 942, 946 
MINITAB 991 
Minor 309 
Mixed 

boundary value problem 558, 587, 759, 917 
triple product 381 

Mixing problem 13, 130, 146, 163, 259 
Mks system: Front cover 
ML-inequality 644 
Mobius 453 

strip 453, 456 
transformation 734 
Mode 542, 582 

Modeling 2, 6, 13, 61, 84, 130, 159, 222, 340, 499, 
538, 569, 750-767 
Modified Bessel functions 203 
Modulus 607 
Molecule 912 
Moivre’s formula 610 
Moment 

central 1019 
of a distribution 1019 
of a force 380 
generating function 1026 
of inertia 436, 455, 457 
of a sample 1046 
vector 380 

Monotone sequence A69 
Moore’s shortest path algorithm 960 
Morera’s theorem 661 
Moulton 900 

Moving trihedron (see Trihedron) 

M-test for convergence 969 
Multinomial distribution 1025 
Multiple point 39 1 
Multiplication of 

complex numbers 603, 609 
determinants 322 
matrices 278, 321 
means 1038 
power series 1 74, 680 
vectors 219, 371, 377 
Multiplication rule for events 1003 
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Multiplicity 337, 865 
Multiply connected domain 646 
Multistep method 898 
“Multivalued function” 615 
Mutually exclusive events 998 

N 

Nabla 403 
NAG 779 
Natural 

frequency 63 
logarithm 630, A60 
spline 812 

Neighborhood 387, 613 
Nested form 786 
NETLIB 779 

Networks 132, 146, 162, 244, 260, 263, 277, 331 
in graph theory 973 
Neumann, C. 201 
functions 201 
problem 558, 587, 917 
Newton 14 

-Cotes formulas 822 
interpolation formulas 802, 805, 807 
law of cooling 14 
law of gravitation 385 
method 800 
-Raphson method 800 
second law 62 
Neyman 1049, 1058 
Nicolson 924 
Nicomedes 399 
Nilpotent matrix 286 
NIST 779 
Nodal 

incidence matrix 277 
line 574 
Node 142, 797 
Nonbasic variables 945 
Nonconservative 428 
Nonhomogeneous 

differential equation 27, 78, 1 16, 159, 305, 535 
system of equations 288, 304 
Nonlinear differential equations 45, 151 
Nonorientable surface 453 
Nonparametric test 1080 
Nonsingular matrix 315 
Norm 205, 326, 346, 359, 365, 849 


Normal 

acceleration 395 
asymptotically 1057 
to a curve 398 
derivative 444, 465 

distribution 1026, 1047-1057, 1062-1067, A98 
two-dimensional 1090 
equations 860, 1086 
form of a PDE 551 
matrix 362, 869 
mode 542, 582 
plane 398 
to a plane 375 
random variable 1026 
to a surface 447 
vector 375, 447 

Null 

hypothesis 1058 
matrix ( see Zero matrix) 
space 301 

vector {see Zero vector) 

Nullity 301 

Numeric methods 777-934 
differentiation 827 
eigenvalues 863-882 
equations 787-796 
integration 817-827 
interpolation 797-816 
linear equations 833-858 
matrix inversion 315, 844 
optimization 936-953 
ordinary differential equations (ODEs) 

886-908 

partial differential equations (PDEs) 909-930 
Nystrom method 906 

O 

0 962 

Objective function 936 
OC curve 1062 
Odd function 490 

ODE 4 ( see also Differential equations) 

Ohm’s law 92 
One-dimensional 
heat equation 553 
wave equation 539 
One-sided 

derivative 484 
test 1060 
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One-step method 898 
One-to-one mapping 729 
Open 

disk 613 

integration formula 827 
interval A69 
point set 613 

Operating characteristic 1062 
Operational calculus 59, 220 
Operation count 838 
Operator 59, 327 

Optimality principle, Bellman’s 963 
Optimal solution 942 
Optimization 936-953, 959-990 
Orbit 141 
Order 887, 962 

of a determinant 308 
of a differential equation 4, 535 
of an iteration process 793 
Ordering 969 

Ordinary differential equations 2-269, 886-908 
(see also Differential equation) 

Orientable surface 452 
Orientation of a 
curve 638 
surface 452 
Orthogonal 

coordinates A71 
curves 35 
eigenvectors 350 
expansion 210 
functions 205, 482 
matrix 345 
polynomials 209 
series 210 
trajectories 35 
transformation 346 
vectors 326, 371 
Orthonormal functions 205, 210 
Oscillations 

of a beam 547, 552 
of a cable 198 
in circuits 91 
damped 64, 88 
forced 84 
free 61, 500, 547 
harmonic 63 

of a mass on a spring 61, 86, 135, 150, 243, 
252, 261, 342, 499 
of a membrane 569-586 


Oscillations ( Cont .) 
self-sustained 157 
of a string 204, 538, 929 
undamped 62, 

Osculating plane 398 
Outcome 997 
Outer product 377 
Outlier 995 
Output 26, 230 
Overdamping 65 
Overdetermined system 292 
Overflow 782 
Overrelaxation 851 
Overtone 542 


P 

Paired comparison 1065 

Pappus’s theorem 458 

Parabolic differential equation 551, 922 

Paraboloid 448 

Parachutist 12 

Parallelepiped 382 

Parallel flow 766 

Parallelogram 

equality 326, 372, 612 
law 367 

Parameter of a distribution 1016 
Parametric representation 389, 446 
Parking problem 1023 
Parse val’s equality 215, 504 
Partial 

derivative 388, A66 
differential equation 535, 909 
fractions 231, 245 
pivoting 291, 834 
sum 171,480, 666 
Particular solution 6, 48, 106, 159 
Pascal 399 
Path 

in a digraph 974 
in a graph 959 
of integration 421, 637 

PDE 535, 909 (see also Differential equation) 

Peaceman-Rachford method 915 

Pearson, E. S. 1058 

Pearson, K. 1066 

Pendulum 68, 152, 156 

Period 478 
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Periodic 

extension 494 
function 478 
Permutation 1006 

Perron-Frobenius theorem 344, 869 

Pfaffian form 429 

Phase 

angle 88 

of complex number (see Argument) 
lag 88 

plane 141, 147 
portrait 141, 147 
Picard 

iteration method 41 
theorem 709 
Piecewise 

continuous 226 
smooth 421, 448, 639 
Pivoting 291, 834 
Planar graph 987 
Plane 315, 375 
Plane curve 391 
Poincare 216 
Point 

estimate 1046 
at infinity 710, 736 
set 613 
source 765 
spectrum 507, 524 
Poisson 769 

distribution 1022, 1073, A97 
equation 910, 918 
integral formula 769 
Polar 

coordinates 437, 443, 580, 607-608 
form of complex numbers 607 
moment on inertia 436 
Pole 708 
Polynomial 

approximation 797 
matrix 865 

Polynomially bounded 962 
Polynomials 617 
Chebyshev 209 
Hermite 216 
Laguerre 207, 257 
Legendre 179, 207, 590, 826 
trigonometric 502 
Population in statistics 1044 
Population models 31, 154, 341 


Position vector 366 
Positive definite 326, 372 
Possible values 1012 
Postman problem 963 
Potential 407, 427, 590, 750, 762 
complex 763 
theory 465, 749 
Power 

method for eigenvalues 872 
of a test 1061 
series 167, 673 
series method 167 
Precision 782 
Predator-prey 154 
Predictor-corrector 890, 900 
Pre-Hilbert space 326 
Prim’s algorithm 971 
Principal 

axes theorem 354 
branch 632 

diagonal (see main diagonal) 
directions 340 
normal (Fig. 210) 397 
part 708 

value 607, 630, 632, 719, 722 
Prior estimate 794 
Probability 1000, 1001 
conditional 1003 
density 1014, 1034 
distribution 1010, 1032 
function 1012, 1033 
Producer’s risk 1075 
Product (see Multiplication) 
Projection of a vector 374 
Pseudocode 783 
Pure imaginary number 603 

Q 

QR-factorization method 879 
Quadratic 

equation 785 
form 353 
interpolation 799 

Qualitative methods 124, 139-165 

Quality control 1068 

Quartile 995 

Quasilinear 551, 909 

Quotient of complex numbers 604 



Index 


117 


R 

Rachford method 915 
Radiation 7, 561 
Radiocarbon dating 13 
Radius 

of convergence 172, 675, 686 
of a graph 973 
Random 

experiment 997 
numbers 1045 
variable 1010, 1032 
Range of a 

function 614 
sample 994 

Rank of a matrix 297, 299, 31 1 
Raphson 790 
Rational function 617 
Ratio test 669 
Rayleigh 159 
equation 159 
quotient 872 
RC - circuit 97, 237, 240 
Reactance 93 
Real 

axis 604 
part 602 
sequence 664 
vector space 324, 369 
Rectangular 

membrane 57 1 
pulse 238, 243 
rule 817 

wave 211, 480, 488, 492 
Rectifiable curve 393 
Rectification of a lot 1075 
Rectifier 248, 489, 492 
Rectifying plane 398 
Reduction of order 50, 116 
Region 433, 614 
Regression 1083 

coefficient 1085, 1088 
line 1084 
Regula falsi 796 
Regular 

point of an ODE 1 83 
Sturm-Liouville problem 206 
Rejectable quality level 1075 
Rejection region 1060 
Relative 

class frequency 994 


Relative {Corn) 
error 784 
frequency 1000 
Relaxation 850 
Remainder 171, 666 
Removable singularity 709 
Representation 328 
Residual 849, 852 
Residue 713 
Residue theorem 715 
Resistance 91 
Resonance 86 
Response 28, 84 
Restoring force 62 
Resultant of forces 367 
Riccati equation 34 
Riemann 618 
sphere 710 
surface 746 
Right-hand 

derivative 484 
limit 484 
Right-handed 377 
Risk 1095 
/?L-circuit 97, 240 
RLC - circuit 95, 241, 244, 499 
Robin problem 558, 587 
Rodrigues’s formula 181 
Romberg integration 829 
Root 610 
Root test 671 

Rotation 381, 385, 414, 734, 764 

Rounding 782 

Row 

echelon form 294 
-equivalent 292, 298 
operations 292 
scaling 838 
space 300 
sum norm 849 
vector 274 

Runge-Kutta-Fehlberg 893 
Runge-Kutta methods 892, 904 
Runge-Kutta-Nystrom 906 

S 

Saddle point 143 
Sample 997, 1045 
covariance 1085 
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Sample ( Com .) 

distribution function 1076 

mean 996, 1045 

moments 1046 

point 997 

range 994 

size 997, 1045 

space 997 

standard deviation 996, 1046 
variance 996, 1045 
Sampling 1004, 1023, 1073 
SAS 991 

Sawtooth wave 248, 493, 505 
Scalar 276, 364 
field 384 
function 384 
matrix 284 

multiplication 276, 368 
triple product 381 
Scaling 838 
Scheduling 987 
Schoenberg 810 
Schrodinger 242 
Schur’s inequality 869 
Schwartz 242 
Schwarz inequality 326 
Secant 627, A62 
method 794 
Second 

Green’s formula 466 
shifting theorem 235 
Sectionally continuous 

( see Piecewise continuous) 

Seidel 846 
Self-starting 898 
Self-sustained oscillations 157 
Separable differential equation 12 
Separation of variables 12, 540 
Sequence 664, A69 
Series 666, A69 
addition of 680 
of Bessel functions 213, 583 
binomial 689 
convergence of 171, 666 
differentiation of 174, 680 
double Fourier 576 
of eigenfunctions 210 
Fourier 211, 480, 487 
geometric 167, 668, 673, 687, 692 
harmonic 670 


Series (Cent.) 

hypergeometric 188 
infinite 666, A70 
integration of 680 
Laurent 701, 712 
Maclaurin 683 
multiplication of 174, 680 
of orthogonal functions 210 
partial sums of 171, 666 
power 167, 673 
real A69 

remainder of 171, 666, 684 
sum of 171, 666 
Taylor 683 
trigonometric 479 
value of 171, 666 
Serret-Frenet formulas 
{see Frenet formulas) 

Set of points 613 
Shifted data problem 232 
Shifting theorems 224, 235, 528 
Shortest 
path 959 

spanning tree 967 
Shrine of the Book 814 
Sifting 242 

Significance level 1059 
Significant 
digit 781 
in statistics 1059 
Sign test 1081 
Similar matrices 350 
Simple 

curve 391 
event 997 
graph 955 
pole 708 
zero 709 
Simplex 

method 944 
table 945 

Simply connected 640, 646 
Simpson’s rule 821 
Simultaneous 

corrections 850 

differential equations 124 

linear equations {see Linear systems) 

Sine 

of a complex variable 627, 688, 742 
hyperbolic 688 
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Sine ( Cont .) 

integral 509, 690, A65, A95 
of a real variable A60 
Single precision 782 
Single- valued relation 615 
Singular 

at infinity 7 1 1 
matrix 315 
point 183, 686, 707 
solution 8, 50 

Sturm-Liouville problem 206 
Singularity 686, 707 
Sink 464, 765, 973 
SI system: Front cover 
Size of a sample 997, 1045 
Skew-Hermitian 357, 361 
Skewness 1020 

Skew-symmetric matrix 283, 345 

Skydiver 12 

Slack variable 941 

Slope field 9 

Smooth 

curve 421, 638 
piecewise 421, 448, 639 
surface 448 
Sobolev 242 
Soft spring 159 
Software 778, 991 
Solution 

of a differential equation 4, 46, 105, 536 

general 6, 48, 106, 138 

particular 6, 48, 106 

singular 8, 50 

space 304 

steady-state 88 

of a system of differential equations 1 36 
of a system of equations 288 
vector 288 
SOR 851 
Sorting 969, 993 
Source 464, 765, 973 
Span 300 
Spanning tree 967 
Sparse 

graph 957 
matrix 812, 912 
system of equations 846 
Spectral 

density 520 

mapping theorem 344, 865 


Spectral (Cont.) 
radius 848 
representation 520 
shift 344, 865, 874 
Spectrum 324, 542, 864 
Speed 394 
Sphere 446 

Spherical coordinates 588, A71 
Spiral 399 
point 144 
Spline 81 1 
S-PLUS, SPSS 992 
Spring 62 
Square 

error 503 
matrix 274 
membrane 575 
root 792 

wave 21 1,480, 488,492 
Stability 31, 148, 783, 822, 922 
chart 148 

Stagnation point 763 
Staircase function 248 
Standard 

basis 328, 369 
deviation 1016 

form of a linear ODE 26, 45, 105 
Standardized random variable 1018 
Stationary point 937 
Statistical 

inference 1044 
tables A96-A106 
Steady 413, 463 
state 88 

Steepest descent 938 
Steiner 399, 457 
Stem-and-leaf plot 994 
Stencil 912 

Step-by-step method 886 
Step function 234 
Step size control 889 
Stereographic projection 7 1 1 
Stiff 

ODE 896 

system of ODEs 907 
Stirling formula 1008, A64 
Stochastic 
matrix 285 
variable 1011 
Stokes’s theorem 469 
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Index 


Straight line 375, 391 
Stream function 762 
Streamline 761 
Strength of a source 767 
Strictly diagonally dominant 868 
String 204, 538, 594 
Student’s distribution 1053, A 100 
Sturm-Liouville problem 203 
Subgraph 956 

Submarine cable equations 594 
Submatrix 302 

Subsidiary equation 220, 230 
Subspace 300 
invariant 865 
Success 1021 
Successive 

corrections 850 
overrelaxation 85 1 
Sum (see Addition) 

Sum of a series 171, 666 
Superlinear convergence 795 
Superposition principle 106, 138 
Surface 445 

area 435, 442, 454 
integral 449 
normal 406 

Surjective mapping 729 
Symmetric matrix 283, 345 
System of 

differential equations 124-165, 258-263, 902 
linear equations (see Linear system) 
units: Front cover 


Tables 

on differentiation: Front cover 
of Fourier transforms 529-531 
of functions A94-A106 
of integrals: Front cover 
of Laplace transforms 265-267 
statistical A96-A106 
Tangent 627, A62 
to a curve 392, 397 
hyperbolic 629, A62 
plane 406, 447 
vector 392 

Tangential acceleration 395 
Target 973 
Tarjan 971 


Taylor 683 
formula 684 
series 683 

Tchebichef (see Chebyshev) 
distribution 1053, A 100 
Telegraph equations 594 
Termination criterion 791 
Termwise 

differentiation 696 
integration 695 
multiplication 680 

Test 

chi-square 1077 
for convergence 667-672 
of hypothesis 1058-1068 
nonparametric 1080 
Tetrahedron 382 
Thermal 

conductivity 465, 552 
diffusivity 465, 552 
Three 

-eights rule 830 
-sigma limits 1028 
Timetabling 987 
Torricelli’s law 15 
Torsion of a curve 397 
Torsional vibrations 68 
Torus 454 
Total differential 19 
Trace of a matrix 344, 355, 864 
Trail 959 

Trajectories 35, 133, 141 
Transfer function 230 
Transformation 

of Cartesian coordinates A84 
by a complex function 729 
of integrals 437, 439, 459, 469 
linear 281 

linear fractional 734 
orthogonal 346 
similarity 350 
unitary 359 

of vector components A83 
Transient state 88 
Translation 365, 734 
Transmission line equations 593 
Transpose of a matrix 282 
Transpositions 1081, A 106 
Trapezoidal rule 817 
Traveling salesman problem 960 
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Tree 966 
Trend 1081 
Trial 997 

Triangle inequality 326, 372, 608 
Triangular matrix 283 
Tricomi equation 55 1 , 552 
Tridiagonalization 875 
Tridiagonal matrix 812, 875, 914 
Trigonometric 

approximation 502 
form of complex numbers 607, 624 
functions, complex 626, 688 
functions, real A60 
polynomial 502 
series 479 
system 479, 482 
Trihedron 398 
Triple 

integral 458 
product 381 

Trivial solution 27, 304 
Truncation error 783 
Tuning 543 
Twisted curve 391 
Two-dimensional 
distribution 1032 
normal distribution 1090 
random variable 1032 
wave equation 571 
Two-sided test 1060 
Type of a differential equation 551 
Type I and II errors 1060 

u 

UCL 1068 

Unconstrained optimization 937 
Uncorrelated 1090 
Undamped system 62 
Underdamping 66 
Underdetermined system 292 
Underflow 782 

Undetermined coefficients 79, 117, 160 
Uniform 

convergence 691 
distribution 1015, 1017, 1034 
Union of events 998 
Uniqueness 

differential equations 37, 73, 107, 137 
Dirichlet problem 774 


Uniqueness (Cont.) 

Laurent series 705 
linear equations 303 
power series 678 

Unit 

bi normal vector 398 
circle 611 
impulse 242 
matrix 284 
normal vector 447 
principal normal vector 398 
step function 234 
tangent vector 392, 398 
vector 326 
Unitary 

matrix 357 

system of vectors 359 
transformation 359 
Unstable ( see Stability) 

Upper control limit 1068 

V 

Value of a series 171, 666 
Vandermonde determinant 1 12 
Van der Pol equation 157 
Variable 

complex 614 
random 1010, 1032 
standardized random 1018 
stochastic 1011 
Variance of a 

distribution 1016 
sample 996, 1045 

Variation of parameters 98, 118, 160 
Vector 274, 364 

addition 276, 327, 367 
field 384 
function 384 
moment 380 
norm 853 
product 377 
space 300, 323, 369 
subspace 300 
Velocity 394 
field 385 
potential 762 
vector 394 
Venn diagram 998 
Verhulst 31 
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Vertex 955 

coloring 987 
exposed 983 
incidence list 957 
Vibrations {see Oscillations) 

Violin string 538 
Vizing’s theorem 987 
Volta 92 
Voltage drop 92 
Volterra 154, 201, 253 
Volume 435 
Vortex 767 
Vorticity 764 

W 

Waiting time problem 1013 
Walk 959 

Wave equation 536, 539, 569, 929 
Weber 217 

equation 217 


Weber ( Cont .) 

functions 201 
Website see Preface 
Weierstrass 618, 696 

approximation theorem 797 
M-test 696 
Weighted graph 959 
Weight function 205 
Well-conditioned 852 
Wessel 605 

Wheatstone bridge 296 
Work 373, 423 
integral 422 
Wronskian 75, 108 

Z 

Zero 

of analytic function 709 
matrix 276 
vector 367 



Systems of Units. Some Important Conversion Factors 

The most important systems of units are shown in the table below. The mks system is also known as 
the International System of Units (abbreviated SI), and the abbreviations s (instead of sec), 
g (instead of gm), and N (instead of nt) are also used. 


System of units 

Length 

Mass 

Time 

Force 

cgs system 

centimeter (cm) 

gram (gm) 

second (sec) 

dyne 

mks system 

meter (m) 

kilogram (kg) 

second (sec) 

newton (nt) 

Engineering system 

foot (ft) 

slug 

second (sec) 

pound (lb) 


1 inch (in.) = 2.540000 cm 1 foot (ft) = 12 in. = 30.480000 cm 

1 yard (yd) = 3 ft = 91.440000 cm 1 statute mile (mi) = 5280 ft = 1.609344 km 

1 nautical mile = 6080 ft = 1.853184 km 

1 acre = 4840 yd 2 = 4046.8564 m 2 1 mi 2 = 640 acres = 2.5899881 km 2 

1 fluid ounce = 1/128 U.S. gallon = 231/128 in. 3 = 29.573730 cm 3 
1 U.S. gallon = 4 quarts (liq) = 8 pints (liq) = 128 fl oz = 3785.4118 cm 3 
1 British Imperial and Canadian gallon = 1.200949 U.S. gallons = 4546.087 cm 3 
1 slug = 14.59390 kg 

1 pound (lb) = 4.448444 nt 1 newton (nt) = 10 5 dynes 

1 British thermal unit (Btu) = 1054.35 joules 1 joule = 10 7 ergs 

1 calorie (cal) = 4.1840 joules 
1 kilowatt-hour (kWh) = 3414.4 Btu = 3.6 • 10 6 joules 
1 horsepower (hp) = 2542.48 Btu/h = 178.298 cal/sec = 0.74570 kW 
1 kilowatt (kW) = 1000 watts = 3414.43 Btu/h = 238.662 cal/sec 

°F = °C • 1.8 + 32 1° = 60' = 3600" = 0.017453293 radian 


For further details see, for example, D. Halliday, R. Resnick, and J. Walker, Fundamentals of Physics. 7th ed.. New York: 
Wiley. 2005. See also AN American National Standard. ASTM/IEEE Standard Metric Practice, Institute of Electrical and 
Electronics Engineers, Inc., 445 Hoes Lane, Piscataway, N. J. 08854. 






















Differentiation 

(i cu) f = cu (e constant) 


(w + v) f — u + v' 


(uv) f = uv + v'u 


_ UV — VU 


v~ 


dii du dy 

— — — 7- (Chain rule) 

dx dy dx 


(x n Y = nx n ~ l 
(e x Y = e x 
(a x )' = a x In a 
(sin xY — cos a 
(cos a)' = sin x 
(tan x)' = sec 2 a 
( cot a)' = —esc 2 A 
(sinhA)' = cosh a 
(cosh a/ = sinhx 

1 


(In a)' = 


v/ ] °Sae 
(l0g a A) = 


(arcsinA)' = 


1 


(arccosA)' = - 


(arctan a) 7 = 


VHP 2 

1 


VPP 2 


1 + A 2 


Integration 

J uv' dx = uv — J uv dx 
r r n+1 

x n dx= - 7 + c (n*-l) 

J n + 1 

/ — dx = In |a| -I- c 

A 

fe ax dx = -e a * + c 

J a 

J sin a dx = —cos a + e 
J cos a dx = sin a + e 
Jtan a dx = -In |cos a| -h e 
fcotxdx = ln |sin a| + c 
J sec a dx = In |sec a + tan a| + c 

J esc x dx = In |cse a - cot a| + c 

f dx 1 A 

-o — — n = — arctan — he 
J aH a 2 a a 


dr 


= arcsin — l-c 


Va 2 - a l a 

dr a 

—7====== = arcsinh he 

Va 2 + a 2 a 

dx A 

— 7==z — arccosh - -he 
Va 2 - a 2 a 


j 

j 
1 

j sin 2 a dx = §a — | sin 2x -h e 
J cos 2 Adr = |A + |$in2A + e 
J tan 2 xdx = tan a — a -he 
J cot 2 xdx = —cot a — a -h e 
f In xdx = x In x — a H- c 
J e™ sin for dr 


a 2 + b 2 
Je ** cos bx dx 


(a sin bx — b cos foe) + e 


e z + fo 


(a cos for + & sin foe) + e 


(arccotA)' = 


1 

1 + A 2 



Some Constants 


Polar Coordinates 


<? = 2.71828 1828459045 23536 
Ve = 1.64872 12707 00128 14685 
e 2 = 7.38905 60989 30650 22723 


x = r cos 6 y = r sin 6 

r = Vx 2 + y 2 tan 0 = — 

x 


tt = 3.14159 26535 89793 23846 
tj 2 = 9.8696044010 89358 61883 
Vt 7 = 1.77245 38509 05516 02730 

logic w = 0.49714 98726 94133 85435 
In = 1.14472 98858 49400 17414 
logic e = 0.43429 44819 03251 82765 
In 10 = 2.30258 50929 94045 68402 

V2 = 1.41421 35623 73095 04880 
^2 = 1.25992 10498 94873 16477 
V3 = 1.73205 08075 68877 29353 
= 1.44224 95703 07408 38232 
In 2 = 0.69314 71805 59945 30942 
In 3 = 1.09861 22886 68109 69140 

y = 0.57721 56649 01532 86061 

In y = -0.54953 93129 81644 82234 
(see Sec. 5.6) 


dxdy = r dr dd 


Series 


1 - X 


= X * w (W < 1) 


m=0 

oo v m 

= Y _ 

n m! 

771=0 

» ( — i 1 

sinx = 2j 

771=0 

00 i \m v 2m 

COS X = X 


m=0 


(2m + 1)! 

(-l) TO x 2 
(2m)! 


In (1 - x) = - X — (W < !) 

m=l “ 


1° = 0.01745 32925 19943 29577 rad 
1 rad = 57.29577 95130 82320 87680° 

= 57°17'44.806" 


arctan x = 


Greek Alphabet 


oo 

X 


(~l) m x 2m+1 
2m + 1 


Vectors 


(W < i) 


a 

Alpha 

P 

Beta 

y.r 

Gamma 

6, A 

Delta 

e, e 

Epsilon 

t 

Zeta 

V 

Eta 

5hP 

0 

Theta 

i 

Iota 

K 

Kappa 

A, A 

Lambda 


Mu 


V 

Nu 

£ 

Xi 

o 

Omicron 

TT 

Pi 

P 

Rho 

a-, 2 

Sigma 

T 

Tau 

v, Y 

Upsilon 

4>, V. ® 

Phi 

A 

Chi 

& ¥ 

Psi 

CO, fl 

Omega 


a»b = fli&i + 02^2 + <*3^3 

i j k 

a x b = «i a 2 a z 

b x b 2 b 2 


, , - , 3/ . df . 3/ 

8rad/ = ' ,/= 


,. _ bv x dv 2 dv 3 

div v = V*y = — r — 1- — — 

dx dy dz 


curl v = V x v = 


i 

d_ 

dx 


j k 

_d_ d_ 

dy dz 

v 2 v z 





